diff --git a/.gitignore b/.gitignore
index aaa8b111e..04b1763ad 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,4 +1,4 @@
-settings_local.py
-settings.py.templatec
 *.pyc
-*.DS_Store
+.idea
+settings_user.py
+
diff --git a/.gitmodules b/.gitmodules
new file mode 100644
index 000000000..2371634f1
--- /dev/null
+++ b/.gitmodules
@@ -0,0 +1,3 @@
+[submodule "caffe"]
+	path = caffe
+	url = https://github.com/arikpoz/caffe.git
diff --git a/README.md b/README.md
index f7ca84ee2..a06075777 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,19 @@
 # Deep Visualization Toolbox
-This is the code required to run the Deep Visualization Toolbox, as well as to generate the neuron-by-neuron visualizations using regularized optimization.
-The toolbox and methods are described casually [here](http://yosinski.com/deepvis) and more formally in this paper:
+This repository contains an improved version of the tool, developed by Arik Poznanski.
+The most notable improvements are:
+ * Added new visualizations:
+   * Activation Histograms
+   * Activation Correlation
+ * Tool usage made easier
+ * Reduced number of user-tuned parameters
+ * Support for non-sequential networks such as Inception and ResNet
+ * Support for Siamese networks
+ * Enhanced UI (input overlays, color maps, mouse support)
+ * Support for multiple input sources (directory, image list, siamese image list)
+ * Tested on all major network architectures, including LeNet, AlexNet, ZFNet, GoogLeNet, VGGNet, and ResNet.
+
+The original version of the tool was first described [here](http://yosinski.com/deepvis) and more formally in this paper:
 * Jason Yosinski, Jeff Clune, Anh Nguyen, Thomas Fuchs, and Hod Lipson. [Understanding neural networks through deep visualization](http://arxiv.org/abs/1506.06579). Presented at the Deep Learning Workshop, International Conference on Machine Learning (ICML), 2015.
@@ -12,155 +24,108 @@ If you find this paper or code useful, we encourage you to cite the paper. 
BibTeX entry:
 @inproceedings{yosinski-2015-ICML-DL-understanding-neural-networks,
 Author = {Jason Yosinski and Jeff Clune and Anh Nguyen and Thomas Fuchs and Hod Lipson},
 Booktitle = {Deep Learning Workshop, International Conference on Machine Learning (ICML)},
 Title = {Understanding Neural Networks Through Deep Visualization},
 Year = {2015}}
+
+
+## Installation
+The following are installation instructions for the new, improved version of DeepVis.
+    $ git clone --recursive https://github.com/arikpoz/deep-visualization-toolbox.git
+    $ cd deep-visualization-toolbox && ./build_default.sh
+
+Note: there is no need to download Caffe separately; it is now a submodule of this repository and will be downloaded and built by the instructions above.
-# Features
+Run the tool:
-The main toolbox window looks like this, here showing a convolutional unit that responds to automobile wheels:
-
-![DeepVis Toolbox Screenshot bus](doc/example_caffenet-yos_bus_wheel_unit.jpg?raw=true)
-
-For a quick tour of the toolbox features, including what each pane of the above interface is showing, watch this [4 min YouTube video](https://www.youtube.com/watch?v=AgkfIQ4IGaM). In addition to processing images files from disk, the toolbox can run off a webcam for live network visualization **(below left)**.
-The toolbox comes bundled with the default [caffenet-yos](models/caffenet-yos) model weights and pre-computed per-unit visualizations shown in the paper. Weights, but not per-unit visualizations, for [bvlc-googlenet](models/bvlc-googlenet) **(below right)** and [squeezenet](models/squeezenet) can be downloaded by scripts in their respective directories.
-
-[![DeepVis Toolbox Screenshot webcam](doc/example_caffenet-yos_webcam.300.jpg)](doc/example_caffenet-yos_webcam.jpg?raw=true)
-[![DeepVis Toolbox Screenshot bvlc-googlenet](doc/example_bvlc-googlenet_bus.300.jpg)](doc/example_bvlc-googlenet_bus.jpg?raw=true)
-
-You can visualize your own model as well. However, note that the toolbox provides two rather separate sets of features; the first is easy to use with your own model, and the second is more involved:
-
-1. 
**Forward/backward prop**: Images can be run forward through the network to visualize activations, and derivatives of any unit with respect to any other unit can be computed using backprop. In addition to traditional backprop, deconv from [Zeiler and Fergus (2014)](https://scholar.google.com/scholar?q=Zeiler+Visualizing+and+understanding+convolutional+networks) is supported as a way of flowing information backwards through the network. Doing forward and backward passes works for any model that can be run in Caffe (including yours!). - -2. **Per-unit visualizations**: Three types of per-unit visualizations can be computed for a network — max image, deconv of max image, activation maximization via regularized optimization — but these visualizations must be computed *outside* the toolbox and saved as jpg. The toolbox then loads these jpgs to display alongside units as they are selected. Visualizations must be pre-computed because they are far too expensive to run live. For example, going through the 1.3m image training set to find the images causing top-9 activations took 40 hours on our system (for all units). Per-unit visualization jpgs are provided for the caffenet-yos model, but not for the bvlc-googlenet or squeezenet models (and not for yours, but you can [compute them yourself](doc/computing_per_unit_visualizations.md)). 
- -Summary: - -| Model | Forward/Backward prop | Per-unit visualizations | -| ------------- | ---------------- | ----------------------- | -| [caffenet-yos](models/caffenet-yos) | **easy** | **included** | -| [bvlc-googlenet](bvlc-googlenet) | **easy** | not-included, [generate](doc/computing_per_unit_visualizations.md) if desired | -| [squeezenet](models/squeezenet) | **easy** | not-included, [generate](doc/computing_per_unit_visualizations.md) if desired | -| your network | **easy** (just point to your model in `settings_local.py`) | not-included, [generate](doc/computing_per_unit_visualizations.md) if desired | - - - -# Setting up and running the toolbox - -### Step 0: Compile master branch of caffe (optional but recommended) - -Checkout the master branch of [Caffe](http://caffe.berkeleyvision.org/) and compile it on your -machine. If you've never used Caffe before, it can take a bit of time to get all the required libraries in place. Fortunately, the [installation process is well documented](http://caffe.berkeleyvision.org/installation.html). When you're installing the OpenCV dependency, install the Python bindings as well (see Step 2 below). - -Note: When compiling Caffe, you can set `CPU_ONLY := 1` in your `Makefile.config` to skip all the Cuda/GPU stuff. The Deep Visualization Toolbox can run with Caffe in either CPU or GPU mode, and it's simpler to get Caffe to compile for the first time in `CPU_ONLY` mode. If Caffe is compiled with GPU options enabled, CPU vs. GPU may be switched at runtime via a setting in `settings_local.py`. Also, cuDNN may be enabled or disabled by recompiling Caffe with or without cuDNN. - - -### Step 1: Compile the deconv-deep-vis-toolbox branch of caffe - -Instead of using the master branch of Caffe, to use the demo -you'll need the slightly modified [deconv-deep-vis-toolbox Caffe branch](https://github.com/yosinski/caffe/tree/deconv-deep-vis-toolbox) (supporting deconv and a few -extra Python bindings). 
Getting the branch and switching to it is easy. -Starting from your Caffe directory (that is, the directory where you've checked out Caffe, *not* the directory where you've checked out the DeepVis Toolbox), run: - - $ git remote add yosinski https://github.com/yosinski/caffe.git - $ git fetch --all - $ git checkout --track -b deconv-deep-vis-toolbox yosinski/deconv-deep-vis-toolbox - $ < edit Makefile.config to suit your system if not already done in Step 0 > - $ make clean - $ make -j - $ make -j pycaffe - -As noted above, feel free to compile in `CPU_ONLY` mode. - - - -### Step 2: Install prerequisites - -The only prerequisites beyond those required for Caffe are `python-opencv`, `scipy`, and `scikit-image`, which may be installed as follows (other install options exist as well): - -#### Ubuntu: - - $ sudo apt-get install python-opencv scipy python-skimage - -#### Mac using [homebrew](http://brew.sh/): - -Install `python-opencv` using one of the following two lines, depending on whether you want to compile using Intel TBB to enable parallel operations: - - $ brew install opencv - $ brew install --with-tbb opencv - -Install `scipy` either with OpenBLAS... - - $ brew install openblas - $ brew install --with-openblas scipy - -...or without it + $ ./run_toolbox.py - $ brew install scipy +Once the toolbox is running, push 'h' to show a help screen. + +## Loading a New Model -And install `scikit-image` using pip: +1. Define a model settings file: settings_your_model.py +```python +# network definition +caffevis_deploy_prototxt = '../your-model-deploy.prototxt' +caffevis_network_weights = '../your-model-weights.caffemodel' +caffevis_data_mean = '../your-model-mean.npy' - $ pip install scikit-image +# input configuration +static_files_dir = '../input_images_folder' -You may have already installed the `python-opencv` bindings as part of the Caffe setup process. If `import cv2` works from Python, then you're all set. Similarly for `import scipy` and `import skimage`. 
+# output configuration +caffevis_outputs_dir = '../outputs' +layers_to_output_in_offline_scripts = ['conv1','conv2', ..., 'fc1'] +``` +2. Define a user settings file: settings_user.py +```python +# GPU configuration +#caffevis_mode_gpu = False +caffevis_gpu_id = 2 -### Step 3: Download and configure Deep Visualization Toolbox code +# select model to load +model_to_load = 'your_model' +#model_to_load = 'previous_model' +``` -You can put it wherever you like: +## Basic Layout - $ git clone https://github.com/yosinski/deep-visualization-toolbox - $ cd deep-visualization-toolbox +![Basic layout](doc/basic-layout.png) -The settings in the latest version of the toolbox (February 2016) work a bit differently than in earlier versions (April 2015). If you have the latest version (recommended!), -the minimal steps are to create a `settings_local.py` file using the template for the default `caffenet-yos` model: - $ cp models/caffenet-yos/settings_local.template-caffenet-yos.py settings_local.py +## Activation Histograms -And then edit the `settings_local.py` file to make the `caffevis_caffe_root` variable point to the directory where you've compiled caffe in Step 1: +* Helps to study the activity of a channel over a set of inputs. - $ < edit settings_local.py > +* Given a dataset of N input images, we compute the activation of each channel over the dataset and histogram the corresponding values. -*Note on settings:* Settings are now split into two files: a versioned `settings.py` file that provides documentation and default values for all settings and an unversioned `settings_local.py` file. This latter file allows you to override any default setting to tailor the toolbox to your specific setup (Caffe path, CPU vs. GPU, webcam device, etc) and model (model weights, prototxt, sizes of the various panels shown in the toolbox, etc). 
This also makes it easy to distribute settings tweaks alongside models: for example, `models/bvlc-googlenet/settings_local.template-bvlc-googlenet.py` includes the appropriate window pane sizes and so on for the `bvlc-googlenet` model. To load a new model, just change the details in `settings_local.py`, perhaps by copying from the included template.
+1. Detect inactive channels:
+![Detect inactive channels](doc/detect-inactive-channels.png)
-Finally, download the default model weights and corresponding top-9 visualizations saved as jpg (downloads a 230MB model and 1.1GB of jpgs to show as visualization):
+2. Detect inactive layers:
+![Detect inactive layers](doc/detect-inactive-layers.png)
- $ cd models/caffenet-yos/
- $ ./fetch.sh
- $ cd ../..
+* The behavior seen on the right is a clear indication of a problem in the training process: most of the channels are inactive, which effectively reduces the model's capacity.
+* A few possible reasons for this behavior:
+ * A constant zero ReLU activation, a.k.a. a “dead” ReLU.
+ * Poor network initialization.
+
+## Activation Correlation
+* We look for correlations between the activation values of different channels in the same layer to check how much of the network's capacity is actually used.
-### Step 4: Run it!
+* On the left, there is a healthy correlation matrix, where the channels are completely uncorrelated. On the right, there is a correlation matrix in which all the channels are either highly or inversely correlated.
-Simple:
+![Activation correlation](doc/activation-correlation.png)
- $ ./run_toolbox.py
+* In the latter case, the capacity utilization of the network is relatively low, and increasing the number of parameters won't improve performance.
-Once the toolbox is running, push 'h' to show a help screen. You can also have a look at `bindings.py` to see what the various keys do. If the window is too large or too small for your screen, set the `global_scale` and `global_font_size` variables in `settings_local.py` to values smaller or larger than 1.0. 
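The Activation Correlation check described above can be sketched in a few lines of NumPy. This is an illustrative computation only, not the toolbox's implementation; the array shape and the 0.9 threshold are assumptions:

```python
import numpy as np

# Hypothetical activation matrix: one row per input image, one column per
# channel (e.g. each conv channel's activation averaged over spatial positions).
rng = np.random.default_rng(0)
acts = rng.standard_normal((500, 64))

# Simulate a redundant channel: channel 1 duplicates channel 0.
acts[:, 1] = acts[:, 0]

# Channel-by-channel correlation matrix; entries near +/-1 indicate
# channels that carry (almost) the same information.
corr = np.corrcoef(acts, rowvar=False)

# Count strongly (or inversely) correlated channel pairs.
strong = np.abs(corr) > 0.9
np.fill_diagonal(strong, False)
redundant_pairs = int(strong.sum()) // 2  # each pair is counted twice
```

A healthy layer yields a near-diagonal `corr`; many large off-diagonal entries suggest low capacity utilization, as discussed above.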
+## Maximal Input +* For each channel we find the image patch from our dataset that has the highest activation. -# Troubleshooting +![Maximal input](doc/maximal-input.png) -If you have any problems running the Deep Vis Toolbox, here are a few things to try: - * Make sure you can compile the master branch of Caffe (Step 0 above)! If you can't, see the [detailed compilation instructions for Caffe](http://caffe.berkeleyvision.org/installation.html). If you encounter issues, the [caffe-users](https://groups.google.com/forum/#!forum/caffe-users) mailing list is a good place to look for solutions others have found. - * Try using the `dev` branch of this toolbox instead of `master` (`git checkout dev`). Sometimes it's a little more up to date. - * If you get an error (`AttributeError: 'Classifier' object has no attribute 'backward_from_layer'`) when switching to backprop or deconv modes, it's because your compiled branch of Caffe does not have the necessary Python bindings for backprop/deconv. Follow the directions in "Step 1: Compile the deconv-deep-vis-toolbox branch of caffe" above. - * If the backprop pane in the lower left is just gray, it's probably because backprop and deconv are producing all zeros. By default, Caffe won't compute derivatives at the data layer, because they're not needed to update parameters. The fix is simple: just add `force_backward: true` to your network prototxt, [like this](https://github.com/yosinski/deep-visualization-toolbox/blob/master/models/caffenet-yos/caffenet-yos-deploy.prototxt#L7). - * If the toolbox runs but the keys don't respond as expected, this may be because keys behave differently on different platforms. Run the `test_keys.py` script to test behavior on your system. - * If none of that helps, feel free to [email me](http://yosinski.com/) or [submit an issue](https://github.com/yosinski/deep-visualization-toolbox/issues). I might have left out an important detail here or there :). 
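The maximal-input search above reduces to a top-k selection over per-image channel activations. A minimal sketch, assuming the activations have already been reduced to one scalar per image and channel (the toolbox itself also records the corresponding receptive-field patch, not just the image index):

```python
import numpy as np

def top_images_per_channel(activations, k=9):
    '''activations: (num_images, num_channels) array, one scalar activation
    per image and channel (e.g. the spatial max of a conv channel).
    Returns a (num_channels, k) array of image indices, strongest first.'''
    order = np.argsort(-activations, axis=0)  # descending order per channel
    return order[:k].T

rng = np.random.default_rng(1)
acts = rng.standard_normal((100, 8))      # 100 images, 8 channels
top9 = top_images_per_channel(acts, k=9)  # shape (8, 9)
```

The resulting indices can then be used to crop, for each channel, the image patches that fall inside its receptive field.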
+## Maximal Optimized
+* Using a regularized optimization process, we approximate, for each channel, the image that empirically produces the highest activation.
+
+![Maximal optimized](doc/maximal-optimized.png)
+
-# Other ways of running the toolbox
-If running the toolbox on a local Mac or Linux machine isn't working for you, you might want to try one of these other options:
+## Backprop Modes
- * John Moeller has put together a [Docker container for the toolbox](https://github.com/fishcorn/dvtb-container). This should even work on Windows! (confirmation needed)
+* Backprop visualization is basically a regular backprop step that continues down to the pixel level. It provides an easy way to study the influence of each pixel on the network's decision.
- * If you're desperate, it's also possible to [run the toolbox on Amazon EC2](doc/deep-vis-on-aws.md), but display will be much slower and images can be loaded only from file (not from webcam).
+* The original toolbox supports only ZF-deconv and vanilla backprop.
+* Our enhanced version also supports guided backpropagation, which provides better localization.
+* Guided backpropagation: gradients are propagated back through a ReLU activation only if both the forward activation and the gradient are positive. 
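The guided-backpropagation rule stated in the last bullet can be expressed directly as a masked ReLU backward pass. This is a generic NumPy sketch of the rule itself, not the toolbox's Caffe implementation:

```python
import numpy as np

def relu_backward_guided(forward_input, grad_output):
    '''Guided backprop through a ReLU: pass a gradient only where the forward
    activation was positive AND the incoming gradient is positive.'''
    mask = (forward_input > 0) & (grad_output > 0)
    return grad_output * mask

x = np.array([-1.0, 2.0, 3.0])   # pre-ReLU forward values
g = np.array([0.5, -0.5, 1.0])   # gradient arriving from the layer above
guided = relu_backward_guided(x, g)
# vanilla ReLU backprop would give [0.0, -0.5, 1.0];
# guided backprop also zeroes the negative gradient: [0.0, 0.0, 1.0]
```

Zeroing the negative gradients suppresses pixels that would decrease the target activation, which is what gives guided backprop its sharper localization.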
+![Guided backprop](doc/guided-backprop.png) diff --git a/app_base.py b/app_base.py index 255855c22..a119662e7 100644 --- a/app_base.py +++ b/app_base.py @@ -6,13 +6,17 @@ class BaseApp(object): def __init__(self, settings, key_bindings): self.debug_level = 0 - def handle_input(self, input_image, panes): + def handle_input(self, input_image, input_label, input_filename, panes): pass def handle_key(self, key, panes): '''Handle key and return either key (to let someone downstream handle it) or None (if this app handled it)''' pass + def handle_mouse_left_click(self, x, y, flags, param, panes): + '''Handle mouse events''' + pass + def redraw_needed(self, key, panes): '''App should return whether or not its internal state has been updated (perhaps in response to handle_key, handle_input, @@ -28,7 +32,7 @@ def draw_help(self, panes): '''Tells the app to draw its help screen in the given pane. No return necessary.''' pass - def start(self): + def start(self, live_vis): '''Notify app to start, possibly creating any necessary threads''' pass diff --git a/bindings.py b/bindings.py index 7de7f9dfd..651e6efa3 100644 --- a/bindings.py +++ b/bindings.py @@ -101,19 +101,29 @@ def get_key_help(self, tag): _.add('zoom_mode', 'z', 'Cycle zooming through {currently selected unit, backprop results, none}') -_.add('pattern_mode', 's', - 'Toggle overlay of preferred input pattern (regularized optimized images)') +_.add('next_pattern_mode', 's', + 'Cycle channels overlay (max opt, max input, weights hist, max hist, correlation, off)') +_.add('prev_pattern_mode', 'S', + 'Cycle patterns overlay (max opt, max input, weights hist, max hist, correlation, off)') +_.add('pattern_first_only', '1', + 'Toggle pattern loading first image only or loading all available images') -_.add('ez_back_mode_loop', 'b', - 'Cycle through a few common backprop/deconv modes') +_.add('next_ez_back_mode_loop', 'b', + 'Cycle through backprop modes (grad, deconv zf, deconv gb, off)') +_.add('prev_ez_back_mode_loop', 
'B', + 'Cycle through backprop modes (grad, deconv zf, deconv gb, off)') _.add('freeze_back_unit', 'd', 'Freeze the bprop/deconv origin to be the currently selected unit') _.add('show_back', 'a', 'Toggle between showing forward activations and back/deconv diffs') -_.add('back_mode', 'n', - '(expert) Change back mode directly.') -_.add('back_filt_mode', 'm', - '(expert) Change back output filter directly.') +_.add('next_back_view_option', 'n', + 'Cycle through backprop view options (raw, gray, norm, normblur, sum>0, histogram)') +_.add('prev_back_view_option', 'N', + 'Cycle through backprop view options (raw, gray, norm, normblur, sum>0, histogram)') +_.add('next_color_map', 'm', + 'Cycle through colormaps options (grayscale, jet, plasma)') +_.add('prev_color_map', 'M', + 'Cycle through colormaps options (grayscale, jet, plasma)') _.add('boost_gamma', 't', 'Boost contrast using gamma correction') @@ -124,4 +134,15 @@ def get_key_help(self, tag): _.add('toggle_unit_jpgs', '9', 'Turn on or off display of loaded jpg visualization') +_.add('siamese_view_mode', 'v', + 'Cycle between siamese view modes {first image, second image, both images}') + +_.add('toggle_maximal_score', 'r', + 'Toggle showing maximal score overlays {on, off}') +_.add('next_input_overlay', 'y', + 'Cycle input overlay {off, over active, over inactive}') +_.add('prev_input_overlay', 'Y', + 'Cycle input overlay {off, over active, over inactive}') + + bindings = _ diff --git a/build_default.sh b/build_default.sh new file mode 100755 index 000000000..7ee60c4af --- /dev/null +++ b/build_default.sh @@ -0,0 +1,12 @@ +#!/usr/bin/env bash +cd caffe +# fix issue with caffe build from source in ubuntu, see more details here: https://github.com/BVLC/caffe/issues/2347 +find . -type f -exec sed -i -e 's^"hdf5.h"^"hdf5/serial/hdf5.h"^g' -e 's^"hdf5_hl.h"^"hdf5/serial/hdf5_hl.h"^g' '{}' \; + +# if Makefile.config has already existed, then don't overwrite it +if [ ! 
-e Makefile.config ]; then + cp Makefile.config.example Makefile.config +fi +make all pycaffe +cd .. +cp settings_user.py.example settings_user.py diff --git a/caffe b/caffe new file mode 160000 index 000000000..7fec25bad --- /dev/null +++ b/caffe @@ -0,0 +1 @@ +Subproject commit 7fec25bad4405fb97b9a48d7307d8a8bb8bacae9 diff --git a/caffe_misc.py b/caffe_misc.py new file mode 100644 index 000000000..b0e5f586d --- /dev/null +++ b/caffe_misc.py @@ -0,0 +1,265 @@ +#! /usr/bin/env python + +import skimage.io +import numpy as np +from image_misc import norm01c + + +def shownet(net): + '''Print some stats about a net and its activations''' + + print '%-41s%-31s%s' % ('', 'acts', 'act diffs') + print '%-45s%-31s%s' % ('', 'params', 'param diffs') + for k, v in net.blobs.items(): + if k in net.params: + params = net.params[k] + for pp, blob in enumerate(params): + if pp == 0: + print ' ', 'P: %-5s'%k, + else: + print ' ' * 11, + print '%-32s' % repr(blob.data.shape), + print '%-30s' % ('(%g, %g)' % (blob.data.min(), blob.data.max())), + print '(%g, %g)' % (blob.diff.min(), blob.diff.max()) + print '%-5s'%k, '%-34s' % repr(v.data.shape), + print '%-30s' % ('(%g, %g)' % (v.data.min(), v.data.max())), + print '(%g, %g)' % (v.diff.min(), v.diff.max()) + + +class RegionComputer(object): + '''Computes regions of possible influcence from higher layers to lower layers.''' + + @staticmethod + def region_converter(top_slice, filter_width=(1, 1), stride=(1, 1), pad=(0, 0)): + ''' + Works for conv or pool + + vector ConvolutionLayer::JBY_region_of_influence(const vector& slice) { + + CHECK_EQ(slice.size(), 4) << "slice must have length 4 (ii_start, ii_end, jj_start, jj_end)"; + + // Crop region to output size + + vector sl = vector(slice); + + sl[0] = max(0, min(height_out_, slice[0])); + + sl[1] = max(0, min(height_out_, slice[1])); + + sl[2] = max(0, min(width_out_, slice[2])); + + sl[3] = max(0, min(width_out_, slice[3])); + + vector roi; + + roi.resize(4); + + roi[0] = sl[0] * 
stride_h_ - pad_h_; + + roi[1] = (sl[1]-1) * stride_h_ + kernel_h_ - pad_h_; + + roi[2] = sl[2] * stride_w_ - pad_w_; + + roi[3] = (sl[3]-1) * stride_w_ + kernel_w_ - pad_w_; + + return roi; + +} + ''' + assert len(top_slice) == 4 + assert len(filter_width) == 2 + assert len(stride) == 2 + assert len(pad) == 2 + + # Crop top slice to allowable region + top_slice = [ss for ss in top_slice] # Copy list or array -> list + + bot_slice = [-123] * 4 + + bot_slice[0] = top_slice[0] * stride[0] - pad[0] + bot_slice[1] = top_slice[1] * stride[0] - pad[0] + filter_width[0] - 1 + bot_slice[2] = top_slice[2] * stride[1] - pad[1] + bot_slice[3] = top_slice[3] * stride[1] - pad[1] + filter_width[1] - 1 + + return bot_slice + + + @staticmethod + def merge_regions(region1, region2): + + region1_x_start, region1_x_end, region1_y_start, region1_y_end = region1 + region2_x_start, region2_x_end, region2_y_start, region2_y_end = region2 + + merged_x_start = min(region1_x_start, region2_x_start) + merged_x_end = max(region1_x_end, region2_x_end) + merged_y_start = min(region1_y_start, region2_y_start) + merged_y_end = max(region1_y_end, region2_y_end) + + merged_region = (merged_x_start, merged_x_end, merged_y_start, merged_y_end) + + return merged_region + + + @staticmethod + def convert_region_dag(settings, from_layer, to_layer, region): + + step_region = None + + layer_def = settings._layer_name_to_record[from_layer] if from_layer in settings._layer_name_to_record else None + + # do single step to convert according to from_layer + if not layer_def: + # fallback to doing nothing + step_region = region + + else: + + if layer_def.type in ['Convolution', 'Pooling']: + step_region = RegionComputer.region_converter(region, layer_def.filter, layer_def.stride, layer_def.pad) + + else: + # fallback to doing nothing + step_region = region + + if from_layer == to_layer: + return step_region + + # handle the rest + total_region = None + + if layer_def is not None: + for parent_layer in 
layer_def.parents: + + # skip inplace layers + if len(parent_layer.tops) == 1 and len(parent_layer.bottoms) == 1 and parent_layer.tops[0] == parent_layer.bottoms[0]: + continue + + # calculate convert_region_dag on each one + current_region = RegionComputer.convert_region_dag(settings, parent_layer.name, to_layer, step_region) + + # aggregate results + if total_region is None: + total_region = current_region + else: + total_region = RegionComputer.merge_regions(total_region, current_region) + + if total_region is None: + return step_region + + return total_region + + +def save_caffe_image(img, filename, autoscale = True, autoscale_center = None): + '''Takes an image in caffe format (01) or (c01, BGR) and saves it to a file''' + if len(img.shape) == 2: + # upsample grayscale 01 -> 01c + img = np.tile(img[:,:,np.newaxis], (1,1,3)) + else: + img = img[::-1].transpose((1,2,0)) + if autoscale_center is not None: + img = norm01c(img, autoscale_center) + elif autoscale: + img = img.copy() + img -= img.min() + img *= 1.0 / (img.max() + 1e-10) + skimage.io.imsave(filename, img) + + +def layer_name_to_top_name(net, layer_name): + + if net.top_names.has_key(layer_name) and len(net.top_names[layer_name]) >= 1: + return net.top_names[layer_name][0] + + else: + return None + +def get_max_data_extent(net, settings, layer_name, is_spatial): + '''Gets the maximum size of the data layer that can influence a unit on layer.''' + + data_size = net.blobs['data'].data.shape[2:4] # e.g. (227,227) for fc6,fc7,fc8,prop + + if is_spatial: + top_name = layer_name_to_top_name(net, layer_name) + conv_size = net.blobs[top_name].data.shape[2:4] # e.g. (13,13) for conv5 + layer_slice_middle = (conv_size[0]/2,conv_size[0]/2+1, conv_size[1]/2,conv_size[1]/2+1) # e.g. (6,7,6,7,), the single center unit + data_slice = RegionComputer.convert_region_dag(settings, layer_name, 'input', layer_slice_middle) + data_slice_size = data_slice[1]-data_slice[0], data_slice[3]-data_slice[2] # e.g. 
(163, 163) for conv5 + # crop data slice size to data size + data_slice_size = min(data_slice_size[0], data_size[0]), min(data_slice_size[1], data_size[1]) + return data_slice_size + else: + # Whole data region + return data_size + + +def compute_data_layer_focus_area(is_spatial, ii, jj, settings, layer_name, size_ii, size_jj, data_size_ii, data_size_jj): + + if is_spatial: + + # Compute the focus area of the data layer + layer_indices = (ii, ii + 1, jj, jj + 1) + + data_indices = RegionComputer.convert_region_dag(settings, layer_name, 'input', layer_indices) + data_ii_start, data_ii_end, data_jj_start, data_jj_end = data_indices + + # safe guard edges + data_ii_start = max(data_ii_start, 0) + data_jj_start = max(data_jj_start, 0) + data_ii_end = min(data_ii_end, data_size_ii) + data_jj_end = min(data_jj_end, data_size_jj) + + touching_imin = (data_ii_start == 0) + touching_jmin = (data_jj_start == 0) + + # Compute how much of the data slice falls outside the actual data [0,max] range + ii_outside = size_ii - (data_ii_end - data_ii_start) # possibly 0 + jj_outside = size_jj - (data_jj_end - data_jj_start) # possibly 0 + + if touching_imin: + out_ii_start = ii_outside + out_ii_end = size_ii + else: + out_ii_start = 0 + out_ii_end = size_ii - ii_outside + if touching_jmin: + out_jj_start = jj_outside + out_jj_end = size_jj + else: + out_jj_start = 0 + out_jj_end = size_jj - jj_outside + + else: + data_ii_start, out_ii_start, data_jj_start, out_jj_start = 0, 0, 0, 0 + data_ii_end, out_ii_end, data_jj_end, out_jj_end = size_ii, size_ii, size_jj, size_jj + + return [out_ii_start, out_ii_end, out_jj_start, out_jj_end, data_ii_start, data_ii_end, data_jj_start, data_jj_end] + + +def extract_patch_from_image(data, net, selected_input_index, settings, + data_ii_end, data_ii_start, data_jj_end, data_jj_start, + out_ii_end, out_ii_start, out_jj_end, out_jj_start, size_ii, size_jj): + if settings.is_siamese: + + # input is first image so select first 3 channels + if 
selected_input_index == 0: + + out_arr = np.zeros((3, size_ii, size_jj), dtype='float32') + out_arr[:, out_ii_start:out_ii_end, out_jj_start:out_jj_end] = data[0:3, + data_ii_start:data_ii_end, + data_jj_start:data_jj_end] + # input is second image so select second 3 channels + elif selected_input_index == 1: + out_arr = np.zeros((3, size_ii, size_jj), dtype='float32') + out_arr[:, out_ii_start:out_ii_end, out_jj_start:out_jj_end] = data[3:6, + data_ii_start:data_ii_end, + data_jj_start:data_jj_end] + # input is both images so select concatenate data horizontally + elif selected_input_index == -1: + + if settings.siamese_input_mode == 'concat_channelwise': + out_arr = np.zeros((3, size_ii, size_jj * 2), dtype='float32') + out_arr[:, out_ii_start:out_ii_end, (0 + out_jj_start):(0 + out_jj_end)] = data[0:3, + data_ii_start:data_ii_end, + data_jj_start:data_jj_end] + out_arr[:, out_ii_start:out_ii_end, (size_jj + out_jj_start):(size_jj + out_jj_end)] = data[3:6, + data_ii_start:data_ii_end, + data_jj_start:data_jj_end] + elif settings.siamese_input_mode == 'concat_along_width': + out_arr = np.zeros((3, size_ii, size_jj), dtype='float32') + out_arr[:, out_ii_start:out_ii_end, out_jj_start:out_jj_end] = data[:, + data_ii_start:data_ii_end, + data_jj_start:data_jj_end] + + else: + print "Error: invalid value for selected_input_index (", selected_input_index, ")" + else: + out_arr = np.zeros((3, size_ii, size_jj), dtype='float32') + out_arr[:, out_ii_start:out_ii_end, out_jj_start:out_jj_end] = data[:, + data_ii_start:data_ii_end, + data_jj_start:data_jj_end] + return out_arr diff --git a/caffevis/app.py b/caffevis/app.py index 582da50f8..0c9fb63af 100644 --- a/caffevis/app.py +++ b/caffevis/app.py @@ -1,23 +1,36 @@ #! 
/usr/bin/env python # -*- coding: utf-8 -import sys -import os +# add parent folder to search path, to enable import of core modules like settings +import os,sys,inspect + +currentdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe()))) +parentdir = os.path.dirname(currentdir) +sys.path.insert(0,parentdir) + import cv2 import numpy as np -import time import StringIO -from misc import WithTimer +from find_maxes.find_max_acts import load_max_tracker_from_file +import find_maxes.max_tracker +sys.modules['max_tracker'] = find_maxes.max_tracker + +from misc import WithTimer, mkdir_p from numpy_cache import FIFOLimitedArrayCache from app_base import BaseApp -from image_misc import norm01, norm01c, norm0255, tile_images_normalize, ensure_float01, tile_images_make_tiles, ensure_uint255_and_resize_to_fit, get_tiles_height_width, get_tiles_height_width_ratio +from image_misc import norm01, norm01c, tile_images_normalize, ensure_float01, tile_images_make_tiles, \ + ensure_uint255_and_resize_to_fit, resize_without_fit, ensure_uint255, \ + caffe_load_image, ensure_uint255_and_resize_without_fit, array_histogram, fig2data from image_misc import FormattedString, cv2_typeset_text, to_255 from caffe_proc_thread import CaffeProcThread -from jpg_vis_loading_thread import JPGVisLoadingThread -from caffevis_app_state import CaffeVisAppState -from caffevis_helper import get_pretty_layer_name, read_label_file, load_sprite_image, load_square_sprite_image, check_force_backward_true - +from caffevis_app_state import CaffeVisAppState, SiameseViewMode, PatternMode, BackpropMode, BackpropViewOption, \ + ColorMapOption, InputOverlayOption +from caffevis_helper import get_pretty_layer_name, read_label_file, load_sprite_image, load_square_sprite_image, \ + set_mean, get_image_from_files +from caffe_misc import layer_name_to_top_name, save_caffe_image +from siamese_helper import SiameseHelper +from settings_misc import load_network, get_receptive_field class 
CaffeVisApp(BaseApp): @@ -25,95 +38,26 @@ class CaffeVisApp(BaseApp): def __init__(self, settings, key_bindings): super(CaffeVisApp, self).__init__(settings, key_bindings) + print 'Got settings', settings self.settings = settings self.bindings = key_bindings - self._net_channel_swap = settings.caffe_net_channel_swap + self.net, self._data_mean = load_network(settings) + + # set network batch size to 1 + current_input_shape = self.net.blobs[self.net.inputs[0]].shape + current_input_shape[0] = 1 + self.net.blobs[self.net.inputs[0]].reshape(*current_input_shape) + self.net.reshape() + + self._net_channel_swap = settings._calculated_channel_swap if self._net_channel_swap is None: self._net_channel_swap_inv = None else: self._net_channel_swap_inv = tuple([self._net_channel_swap.index(ii) for ii in range(len(self._net_channel_swap))]) - self._range_scale = 1.0 # not needed; image already in [0,255] - - # Set the mode to CPU or GPU. Note: in the latest Caffe - # versions, there is one Caffe object *per thread*, so the - # mode must be set per thread! Here we set the mode for the - # main thread; it is also separately set in CaffeProcThread. 
- sys.path.insert(0, os.path.join(settings.caffevis_caffe_root, 'python')) - import caffe - if settings.caffevis_mode_gpu: - caffe.set_mode_gpu() - print 'CaffeVisApp mode (in main thread): GPU' - else: - caffe.set_mode_cpu() - print 'CaffeVisApp mode (in main thread): CPU' - self.net = caffe.Classifier( - settings.caffevis_deploy_prototxt, - settings.caffevis_network_weights, - mean = None, # Set to None for now, assign later # self._data_mean, - channel_swap = self._net_channel_swap, - raw_scale = self._range_scale, - ) - - if isinstance(settings.caffevis_data_mean, basestring): - # If the mean is given as a filename, load the file - try: - - filename, file_extension = os.path.splitext(settings.caffevis_data_mean) - if file_extension == ".npy": - # load mean from numpy array - self._data_mean = np.load(settings.caffevis_data_mean) - print "Loaded mean from numpy file, data_mean.shape: ", self._data_mean.shape - - elif file_extension == ".binaryproto": - - # load mean from binary protobuf file - blob = caffe.proto.caffe_pb2.BlobProto() - data = open(settings.caffevis_data_mean, 'rb').read() - blob.ParseFromString(data) - self._data_mean = np.array(caffe.io.blobproto_to_array(blob)) - self._data_mean = np.squeeze(self._data_mean) - print "Loaded mean from binaryproto file, data_mean.shape: ", self._data_mean.shape - - else: - # unknown file extension, trying to load as numpy array - self._data_mean = np.load(settings.caffevis_data_mean) - print "Loaded mean from numpy file, data_mean.shape: ", self._data_mean.shape - - except IOError: - print '\n\nCound not load mean file:', settings.caffevis_data_mean - print 'Ensure that the values in settings.py point to a valid model weights file, network' - print 'definition prototxt, and mean. To fetch a default model and mean file, use:\n' - print '$ cd models/caffenet-yos/' - print '$ ./fetch.sh\n\n' - raise - input_shape = self.net.blobs[self.net.inputs[0]].data.shape[-2:] # e.g. 227x227 - # Crop center region (e.g. 
227x227) if mean is larger (e.g. 256x256) - excess_h = self._data_mean.shape[1] - input_shape[0] - excess_w = self._data_mean.shape[2] - input_shape[1] - assert excess_h >= 0 and excess_w >= 0, 'mean should be at least as large as %s' % repr(input_shape) - self._data_mean = self._data_mean[:, (excess_h/2):(excess_h/2+input_shape[0]), - (excess_w/2):(excess_w/2+input_shape[1])] - elif settings.caffevis_data_mean is None: - self._data_mean = None - else: - # The mean has been given as a value or a tuple of values - self._data_mean = np.array(settings.caffevis_data_mean) - # Promote to shape C,1,1 - while len(self._data_mean.shape) < 1: - self._data_mean = np.expand_dims(self._data_mean, -1) - - #if not isinstance(self._data_mean, tuple): - # # If given as int/float: promote to tuple - # self._data_mean = tuple(self._data_mean) - if self._data_mean is not None: - self.net.transformer.set_mean(self.net.inputs[0], self._data_mean) - - check_force_backward_true(settings.caffevis_deploy_prototxt) - self.labels = None if self.settings.caffevis_labels: self.labels = read_label_file(self.settings.caffevis_labels) @@ -124,31 +68,16 @@ def __init__(self, settings, key_bindings): raise Exception('caffevis_jpg_cache_size must be at least 10MB for normal operation.') self.img_cache = FIFOLimitedArrayCache(settings.caffevis_jpg_cache_size) - self._populate_net_layer_info() - - def _populate_net_layer_info(self): - '''For each layer, save the number of filters and precompute - tile arrangement (needed by CaffeVisAppState to handle - keyboard navigation). 
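The mean-cropping logic being removed above (now handled inside `set_mean`) takes the central input-sized window from a larger `C,H,W` mean, e.g. a 256x256 mean cropped to a 227x227 network input. A self-contained NumPy sketch of that crop, with illustrative names:

```python
import numpy as np

def center_crop_mean(data_mean, input_hw):
    # Crop the central input_hw region from a C,H,W mean array.
    excess_h = data_mean.shape[1] - input_hw[0]
    excess_w = data_mean.shape[2] - input_hw[1]
    assert excess_h >= 0 and excess_w >= 0, \
        'mean should be at least as large as %s' % repr(input_hw)
    return data_mean[:, (excess_h // 2):(excess_h // 2 + input_hw[0]),
                        (excess_w // 2):(excess_w // 2 + input_hw[1])]

mean = np.zeros((3, 256, 256))
assert center_crop_mean(mean, (227, 227)).shape == (3, 227, 227)
```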
- ''' - self.net_layer_info = {} - for key in self.net.blobs.keys(): - self.net_layer_info[key] = {} - # Conv example: (1, 96, 55, 55) - # FC example: (1, 1000) - blob_shape = self.net.blobs[key].data.shape - assert len(blob_shape) in (2,4), 'Expected either 2 for FC or 4 for conv layer' - self.net_layer_info[key]['isconv'] = (len(blob_shape) == 4) - self.net_layer_info[key]['data_shape'] = blob_shape[1:] # Chop off batch size - self.net_layer_info[key]['n_tiles'] = blob_shape[1] - self.net_layer_info[key]['tiles_rc'] = get_tiles_height_width_ratio(blob_shape[1], self.settings.caffevis_layers_aspect_ratio) - self.net_layer_info[key]['tile_rows'] = self.net_layer_info[key]['tiles_rc'][0] - self.net_layer_info[key]['tile_cols'] = self.net_layer_info[key]['tiles_rc'][1] - - def start(self): - self.state = CaffeVisAppState(self.net, self.settings, self.bindings, self.net_layer_info) + self.header_boxes = [] + self.buttons_boxes = [] + + def start(self, live_vis): + from jpg_vis_loading_thread import JPGVisLoadingThread + + self.live_vis = live_vis + self.state = CaffeVisAppState(self.net, self.settings, self.bindings, live_vis) self.state.drawing_stale = True - self.layer_print_names = [get_pretty_layer_name(self.settings, nn) for nn in self.state._layers] + self.header_print_names = [get_pretty_layer_name(self.settings, nn) for nn in self.state.get_headers()] if self.proc_thread is None or not self.proc_thread.is_alive(): # Start thread if it's not already running @@ -190,7 +119,7 @@ def quit(self): def _can_skip_all(self, panes): return ('caffevis_layers' not in panes.keys()) - def handle_input(self, input_image, panes): + def handle_input(self, input_image, input_label, input_filename, panes): if self.debug_level > 1: print 'handle_input: frame number', self.handled_frames, 'is', 'None' if input_image is None else 'Available' self.handled_frames += 1 @@ -201,8 +130,12 @@ def handle_input(self, input_image, panes): if self.debug_level > 1: print 
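The removed `_populate_net_layer_info` relied on `get_tiles_height_width_ratio` to pick a rows-by-columns grid for a layer's filters. The toolbox's exact implementation is not shown in this hunk; a plausible near-square computation for a given aspect ratio looks like this (an assumption, for illustration only):

```python
import math

def tiles_height_width(n_tiles, aspect_ratio=1.0):
    # Pick a grid with at least n_tiles cells, roughly cols/rows == aspect_ratio.
    rows = int(math.ceil(math.sqrt(float(n_tiles) / aspect_ratio)))
    cols = int(math.ceil(float(n_tiles) / rows))
    return rows, cols

rows, cols = tiles_height_width(96)  # e.g. conv1 of AlexNet with 96 filters
assert rows * cols >= 96
```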
'CaffeVisApp.handle_input: pushed frame' self.state.next_frame = input_image + self.state.next_label = input_label + self.state.next_filename = input_filename if self.debug_level > 1: print 'CaffeVisApp.handle_input: caffe_net_state is:', self.state.caffe_net_state + + self.state.last_frame = input_image def redraw_needed(self): return self.state.redraw_needed() @@ -216,7 +149,7 @@ def draw(self, panes): with self.state.lock: # Hold lock throughout drawing do_draw = self.state.drawing_stale and self.state.caffe_net_state == 'free' - #print 'CaffeProcThread.draw: caffe_net_state is:', self.state.caffe_net_state + # print 'CaffeProcThread.draw: caffe_net_state is:', self.state.caffe_net_state if do_draw: self.state.caffe_net_state = 'draw' @@ -228,6 +161,8 @@ def draw(self, panes): self._draw_control_pane(panes['caffevis_control']) if 'caffevis_status' in panes: self._draw_status_pane(panes['caffevis_status']) + if 'caffevis_buttons' in panes: + self._draw_buttons_pane(panes['caffevis_buttons']) layer_data_3D_highres = None if 'caffevis_layers' in panes: layer_data_3D_highres = self._draw_layer_pane(panes['caffevis_layers']) @@ -262,7 +197,7 @@ def _draw_prob_labels_pane(self, pane): clr_0 = to_255(self.settings.caffevis_class_clr_0) clr_1 = to_255(self.settings.caffevis_class_clr_1) - probs_flat = self.net.blobs[self.settings.caffevis_prob_layer].data.flatten() + probs_flat = self.net.blobs[layer_name_to_top_name(self.net, self.settings.caffevis_prob_layer)].data.flatten() top_5 = probs_flat.argsort()[-1:-6:-1] strings = [] @@ -292,25 +227,30 @@ def _draw_control_pane(self, pane): 'clr': to_255(self.settings.caffevis_control_clr), 'thick': self.settings.caffevis_control_thick} - for ii in range(len(self.layer_print_names)): - fs = FormattedString(self.layer_print_names[ii], defaults) - this_layer = self.state._layers[ii] - if self.state.backprop_selection_frozen and this_layer == self.state.backprop_layer: - fs.clr = to_255(self.settings.caffevis_control_clr_bp) + 
for ii in range(len(self.header_print_names)): + fs = FormattedString(self.header_print_names[ii], defaults) + this_layer_def = self.settings.layers_list[ii] + if self.state.backprop_selection_frozen and this_layer_def == self.state.get_current_backprop_layer_definition(): + fs.clr = to_255(self.settings.caffevis_control_clr_bp) fs.thick = self.settings.caffevis_control_thick_bp - if this_layer == self.state.layer: + if this_layer_def == self.state.get_current_layer_definition(): if self.state.cursor_area == 'top': fs.clr = to_255(self.settings.caffevis_control_clr_cursor) fs.thick = self.settings.caffevis_control_thick_cursor else: - if not (self.state.backprop_selection_frozen and this_layer == self.state.backprop_layer): + if not (self.state.backprop_selection_frozen and this_layer_def == self.state.get_current_backprop_layer_definition()): fs.clr = to_255(self.settings.caffevis_control_clr_selected) fs.thick = self.settings.caffevis_control_thick_selected strings.append(fs) - cv2_typeset_text(pane.data, strings, loc, - line_spacing = self.settings.caffevis_control_line_spacing, - wrap = True) + locy, self.header_boxes = cv2_typeset_text(pane.data, strings, loc, + line_spacing = self.settings.caffevis_control_line_spacing, + wrap = True) + + if hasattr(self.settings, 'control_pane_height'): + self.settings._calculated_control_pane_height = self.settings.control_pane_height + else: + self.settings._calculated_control_pane_height = locy - loc[1] + 4 def _draw_status_pane(self, pane): pane.data[:] = to_255(self.settings.window_background) @@ -322,90 +262,293 @@ def _draw_status_pane(self, pane): loc = self.settings.caffevis_status_loc[::-1] # Reverse to OpenCV c,r order status = StringIO.StringIO() + status2 = StringIO.StringIO() fps = self.proc_thread.approx_fps() with self.state.lock: - print >>status, 'pattern' if self.state.pattern_mode else ('back' if self.state.layers_show_back else 'fwd'), - print >>status, '%s:%d |' % (self.state.layer, 
self.state.selected_unit), + pattern_first_mode = "first" if self.state.pattern_first_only else "all" + if self.state.pattern_mode == PatternMode.MAXIMAL_OPTIMIZED_IMAGE: + print >> status, 'pattern(' + pattern_first_mode + ' optimized max)' + elif self.state.pattern_mode == PatternMode.MAXIMAL_INPUT_IMAGE: + print >> status, 'pattern(' + pattern_first_mode + ' input max)' + elif self.state.pattern_mode == PatternMode.WEIGHTS_HISTOGRAM: + print >> status, 'histogram(weights)' + elif self.state.pattern_mode == PatternMode.MAX_ACTIVATIONS_HISTOGRAM: + print >> status, 'histogram(maximal activations)' + elif self.state.pattern_mode == PatternMode.ACTIVATIONS_CORRELATION: + print >> status, 'correlation(maximal activations)' + elif self.state.pattern_mode == PatternMode.WEIGHTS_CORRELATION: + print >> status, 'correlation(weights)' + elif self.state.layers_show_back: + print >> status, 'back' + else: + print >> status, 'fwd' + + default_layer_name = self.state.get_default_layer_name() + print >>status, '%s:%d |' % (default_layer_name, self.state.selected_unit), if not self.state.back_enabled: print >>status, 'Back: off', else: - print >>status, 'Back: %s' % ('deconv' if self.state.back_mode == 'deconv' else 'bprop'), - print >>status, '(from %s_%d, disp %s)' % (self.state.backprop_layer, - self.state.backprop_unit, - self.state.back_filt_mode), + print >>status, 'Back: %s (%s)' % (BackpropMode.to_string(self.state.back_mode), BackpropViewOption.to_string(self.state.back_view_option)), + print >>status, '(from %s_%d)' % (self.state.get_default_layer_name(self.state.get_current_backprop_layer_definition()), self.state.backprop_unit), print >>status, '|', print >>status, 'Boost: %g/%g' % (self.state.layer_boost_indiv, self.state.layer_boost_gamma) if fps > 0: print >>status, '| FPS: %.01f' % fps + if self.state.next_label: + print >> status, '| GT Label: %s' % self.state.next_label + if self.state.extra_msg: print >>status, '|', self.state.extra_msg - self.state.extra_msg 
= '' - strings = [FormattedString(line, defaults) for line in status.getvalue().split('\n')] + print >> status2, 'Layer size: %s' % (self.state.get_layer_output_size_string()) + + print >> status2, '| Receptive field:', '%s' % (str(get_receptive_field(self.settings, self.net, default_layer_name))) + + print >> status2, '| Input: %s' % (str(self.state.next_filename)) + + strings_line1 = [FormattedString(line, defaults) for line in status.getvalue().split('\n')] + strings_line2 = [FormattedString(line, defaults) for line in status2.getvalue().split('\n')] + + locy, boxes = cv2_typeset_text(pane.data, strings_line1, (loc[0], loc[1] + 5), + line_spacing = self.settings.caffevis_status_line_spacing) + + locy, boxes = cv2_typeset_text(pane.data, strings_line2, (loc[0], locy), + line_spacing=self.settings.caffevis_status_line_spacing) + + def _draw_buttons_pane(self, pane): + + pane.data[:] = to_255(self.settings.window_background) + + header_defaults = {'face': getattr(cv2, self.settings.caffevis_buttons_header_face), + 'fsize': self.settings.caffevis_buttons_header_fsize, + 'clr': to_255(self.settings.caffevis_buttons_header_clr), + 'thick': self.settings.caffevis_buttons_header_thick} + normal_defaults = {'face': getattr(cv2, self.settings.caffevis_buttons_normal_face), + 'fsize': self.settings.caffevis_buttons_normal_fsize, + 'clr': to_255(self.settings.caffevis_buttons_normal_clr), + 'thick': self.settings.caffevis_buttons_normal_thick} + selected_defaults = {'face': getattr(cv2, self.settings.caffevis_buttons_selected_face), + 'fsize': self.settings.caffevis_buttons_selected_fsize, + 'clr': to_255(self.settings.caffevis_buttons_selected_clr), + 'thick': self.settings.caffevis_buttons_selected_thick} + + loc = self.settings.caffevis_buttons_loc[::-1] # Reverse to OpenCV c,r order + + text = StringIO.StringIO() + fps = self.proc_thread.approx_fps() + + lines = list() + + with self.state.lock: + lines.append([FormattedString('Input', header_defaults)]) + + 
file_defaults = selected_defaults if self.live_vis.input_updater.static_file_mode else normal_defaults + camera_defaults = selected_defaults if not self.live_vis.input_updater.static_file_mode else normal_defaults + + lines.append([FormattedString('File', file_defaults), FormattedString('Prev', normal_defaults), FormattedString('Next', normal_defaults)]) + lines.append([FormattedString('Camera', camera_defaults)]) + lines.append([FormattedString('', normal_defaults)]) + + activations_defaults = selected_defaults if self.state.pattern_mode == PatternMode.OFF and not self.state.layers_show_back else normal_defaults + gradients_defaults = selected_defaults if self.state.pattern_mode == PatternMode.OFF and self.state.layers_show_back else normal_defaults + max_optimized_defaults = selected_defaults if self.state.pattern_mode == PatternMode.MAXIMAL_OPTIMIZED_IMAGE else normal_defaults + max_input_defaults = selected_defaults if self.state.pattern_mode == PatternMode.MAXIMAL_INPUT_IMAGE else normal_defaults + weights_hist_defaults = selected_defaults if self.state.pattern_mode == PatternMode.WEIGHTS_HISTOGRAM else normal_defaults + act_hist_defaults = selected_defaults if self.state.pattern_mode == PatternMode.MAX_ACTIVATIONS_HISTOGRAM else normal_defaults + weights_corr_defaults = selected_defaults if self.state.pattern_mode == PatternMode.WEIGHTS_CORRELATION else normal_defaults + act_corr_defaults = selected_defaults if self.state.pattern_mode == PatternMode.ACTIVATIONS_CORRELATION else normal_defaults + lines.append([FormattedString('Modes', header_defaults)]) + lines.append([FormattedString('Activations', activations_defaults)]) + lines.append([FormattedString('Gradients', gradients_defaults)]) + lines.append([FormattedString('Maximal Optimized', max_optimized_defaults)]) + lines.append([FormattedString('Maximal Input', max_input_defaults)]) + lines.append([FormattedString('Weights Histogram', weights_hist_defaults)]) + lines.append([FormattedString('Activations 
Histogram', act_hist_defaults)]) + lines.append([FormattedString('Weights Correlation', weights_corr_defaults)]) + lines.append([FormattedString('Activations Correlation', act_corr_defaults)]) + lines.append([FormattedString('', normal_defaults)]) + + no_overlay_defaults = selected_defaults if self.state.input_overlay_option == InputOverlayOption.OFF else normal_defaults + over_active_defaults = selected_defaults if self.state.input_overlay_option == InputOverlayOption.OVER_ACTIVE else normal_defaults + over_inactive_defaults = selected_defaults if self.state.input_overlay_option == InputOverlayOption.OVER_INACTIVE else normal_defaults + lines.append([FormattedString('Input Overlay', header_defaults)]) + lines.append([FormattedString('No Overlay', no_overlay_defaults)]) + lines.append([FormattedString('Over Active', over_active_defaults)]) + lines.append([FormattedString('Over Inactive', over_inactive_defaults)]) + lines.append([FormattedString('', normal_defaults)]) + + backprop_no_defaults = selected_defaults if self.state.back_mode == BackpropMode.OFF else normal_defaults + backprop_gradients_defaults = selected_defaults if self.state.back_mode == BackpropMode.GRAD else normal_defaults + backprop_zf_defaults = selected_defaults if self.state.back_mode == BackpropMode.DECONV_ZF else normal_defaults + backprop_gb_defaults = selected_defaults if self.state.back_mode == BackpropMode.DECONV_GB else normal_defaults + backprop_frozen_defaults = selected_defaults if self.state.backprop_selection_frozen else normal_defaults + lines.append([FormattedString('Backprop Modes', header_defaults)]) + lines.append([FormattedString('No Backprop', backprop_no_defaults)]) + lines.append([FormattedString('Gradient', backprop_gradients_defaults)]) + lines.append([FormattedString('ZF Deconv', backprop_zf_defaults)]) + lines.append([FormattedString('Guided Backprop', backprop_gb_defaults)]) + lines.append([FormattedString('Freeze Origin', backprop_frozen_defaults)]) + 
lines.append([FormattedString('', normal_defaults)]) + + backview_raw_defaults = selected_defaults if self.state.back_view_option == BackpropViewOption.RAW else normal_defaults + backview_gray_defaults = selected_defaults if self.state.back_view_option == BackpropViewOption.GRAY else normal_defaults + backview_norm_defaults = selected_defaults if self.state.back_view_option == BackpropViewOption.NORM else normal_defaults + backview_normblur_defaults = selected_defaults if self.state.back_view_option == BackpropViewOption.NORM_BLUR else normal_defaults + backview_possum_defaults = selected_defaults if self.state.back_view_option == BackpropViewOption.POS_SUM else normal_defaults + backview_hist_defaults = selected_defaults if self.state.back_view_option == BackpropViewOption.HISTOGRAM else normal_defaults + lines.append([FormattedString('Backprop Views', header_defaults)]) + lines.append([FormattedString('Raw', backview_raw_defaults)]) + lines.append([FormattedString('Gray', backview_gray_defaults)]) + lines.append([FormattedString('Norm', backview_norm_defaults)]) + lines.append([FormattedString('Blurred Norm', backview_normblur_defaults)]) + lines.append([FormattedString('Sum > 0', backview_possum_defaults)]) + lines.append([FormattedString('Gradient Histogram', backview_hist_defaults)]) + lines.append([FormattedString('', normal_defaults)]) + + lines.append([FormattedString('Help', normal_defaults)]) + lines.append([FormattedString('Quit', normal_defaults)]) + + # strings_line1 = [[FormattedString(line, defaults)] for line in text.getvalue().split('\n')] + + locy, self.buttons_boxes = cv2_typeset_text(pane.data, lines, (loc[0], loc[1] + 5), + line_spacing=self.settings.caffevis_buttons_line_spacing) + + return + + + def prepare_tile_image(self, display_3D, highlight_selected, n_tiles, tile_rows, tile_cols): + + if self.state.layers_show_back and self.state.pattern_mode == PatternMode.OFF: + padval = self.settings.caffevis_layer_clr_back_background + else: + 
padval = self.settings.window_background + + highlights = [None] * n_tiles + if highlight_selected: + with self.state.lock: + if self.state.cursor_area == 'bottom': + highlights[self.state.selected_unit] = self.settings.caffevis_layer_clr_cursor # in [0,1] range + if self.state.backprop_selection_frozen and self.state.get_current_layer_definition() == self.state.get_current_backprop_layer_definition(): + highlights[self.state.backprop_unit] = self.settings.caffevis_layer_clr_back_sel # in [0,1] range + + _, display_2D = tile_images_make_tiles(display_3D, hw=(tile_rows, tile_cols), padval=padval, highlights=highlights) + + return display_2D - cv2_typeset_text(pane.data, strings, loc, - line_spacing = self.settings.caffevis_status_line_spacing) - def _draw_layer_pane(self, pane): '''Returns the data shown in highres format, b01c order.''' - - if self.state.layers_show_back: - layer_dat_3D = self.net.blobs[self.state.layer].diff[0] + + default_layer_name = self.state.get_default_layer_name() + + if self.state.siamese_view_mode_has_two_images(): + + if self.state.layers_show_back: + + layer_dat_3D_0, layer_dat_3D_1 = self.state.get_siamese_selected_diff_blobs(self.net) + else: + layer_dat_3D_0, layer_dat_3D_1 = self.state.get_siamese_selected_data_blobs(self.net) + + # Promote FC layers with shape (n) to have shape (n,1,1) + if len(layer_dat_3D_0.shape) == 1: + layer_dat_3D_0 = layer_dat_3D_0[:, np.newaxis, np.newaxis] + layer_dat_3D_1 = layer_dat_3D_1[:, np.newaxis, np.newaxis] + + # we don't resize the images to half the size since there is no point in doing that in FC layers + elif layer_dat_3D_0.shape[2] == 1: + # we don't resize the images to half the size since it will crash + pass + else: + # resize images to half the size + half_pane_shape = (layer_dat_3D_0.shape[1], layer_dat_3D_0.shape[2] / 2) + layer_dat_3D_0 = resize_without_fit(layer_dat_3D_0.transpose((1, 2, 0)), half_pane_shape).transpose((2, 0, 1)) + layer_dat_3D_1 = 
resize_without_fit(layer_dat_3D_1.transpose((1, 2, 0)), half_pane_shape).transpose((2, 0, 1)) + + # concatenate images side-by-side + layer_dat_3D = np.concatenate((layer_dat_3D_0, layer_dat_3D_1), axis=2) + else: - layer_dat_3D = self.net.blobs[self.state.layer].data[0] + if self.state.layers_show_back: + layer_dat_3D = self.state.get_single_selected_diff_blob(self.net) + else: + layer_dat_3D = self.state.get_single_selected_data_blob(self.net) + # Promote FC layers with shape (n) to have shape (n,1,1) if len(layer_dat_3D.shape) == 1: - layer_dat_3D = layer_dat_3D[:,np.newaxis,np.newaxis] + layer_dat_3D = layer_dat_3D[:, np.newaxis, np.newaxis] n_tiles = layer_dat_3D.shape[0] - tile_rows,tile_cols = self.net_layer_info[self.state.layer]['tiles_rc'] + top_name = layer_name_to_top_name(self.net, default_layer_name) + tile_rows, tile_cols = self.state.net_blob_info[top_name]['tiles_rc'] + + display_2D = None display_3D_highres = None - if self.state.pattern_mode: + is_layer_summary_loaded = False + if self.state.pattern_mode != PatternMode.OFF: # Show desired patterns loaded from disk - load_layer = self.state.layer - if self.settings.caffevis_jpgvis_remap and self.state.layer in self.settings.caffevis_jpgvis_remap: - load_layer = self.settings.caffevis_jpgvis_remap[self.state.layer] + if self.state.pattern_mode == PatternMode.MAXIMAL_OPTIMIZED_IMAGE: - - if self.settings.caffevis_jpgvis_layers and load_layer in self.settings.caffevis_jpgvis_layers: - jpg_path = os.path.join(self.settings.caffevis_unit_jpg_dir, - 'regularized_opt', load_layer, 'whole_layer.jpg') + if self.settings.caffevis_outputs_dir_folder_format == 'original_combined_single_image': - # Get highres version - #cache_before = str(self.img_cache) - display_3D_highres = self.img_cache.get((jpg_path, 'whole'), None) - #else: - # display_3D_highres = None + display_2D, display_3D, display_3D_highres, is_layer_summary_loaded = self.load_pattern_images_original_format( + default_layer_name, layer_dat_3D, 
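The siamese branch above halves each blob's width and concatenates the pair side by side so both images fit in one pane. A minimal NumPy sketch of that layout step, using stride-2 slicing as a stand-in for the toolbox's `resize_without_fit`:

```python
import numpy as np

def side_by_side(blob0, blob1):
    # Halve each C,H,W blob along width, then join along the width axis.
    half0 = blob0[:, :, ::2]
    half1 = blob1[:, :, ::2]
    return np.concatenate((half0, half1), axis=2)

a = np.ones((96, 55, 55))
b = np.zeros((96, 55, 55))
combined = side_by_side(a, b)
assert combined.shape == (96, 55, 56)  # two 28-wide halves of a 55-wide blob
```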
n_tiles, pane, tile_cols, tile_rows) - if display_3D_highres is None: - try: - with WithTimer('CaffeVisApp:load_sprite_image', quiet = self.debug_level < 1): - display_3D_highres = load_square_sprite_image(jpg_path, n_sprites = n_tiles) - except IOError: - # File does not exist, so just display disabled. - pass - else: - self.img_cache.set((jpg_path, 'whole'), display_3D_highres) - #cache_after = str(self.img_cache) - #print 'Cache was / is:\n %s\n %s' % (cache_before, cache_after) - - if display_3D_highres is not None: - # Get lowres version, maybe. Assume we want at least one pixel for selection border. - row_downsamp_factor = int(np.ceil(float(display_3D_highres.shape[1]) / (pane.data.shape[0] / tile_rows - 2))) - col_downsamp_factor = int(np.ceil(float(display_3D_highres.shape[2]) / (pane.data.shape[1] / tile_cols - 2))) - ds = max(row_downsamp_factor, col_downsamp_factor) - if ds > 1: - #print 'Downsampling by', ds - display_3D = display_3D_highres[:,::ds,::ds,:] - else: - display_3D = display_3D_highres - else: - display_3D = layer_dat_3D * 0 # nothing to show + elif self.settings.caffevis_outputs_dir_folder_format == 'max_tracker_output': + + display_2D, display_3D, display_3D_highres, is_layer_summary_loaded = self.load_pattern_images_optimizer_format( + default_layer_name, layer_dat_3D, n_tiles, pane, tile_cols, tile_rows, + self.state.pattern_first_only, file_search_pattern='opt*.jpg') + + elif self.state.pattern_mode == PatternMode.MAXIMAL_INPUT_IMAGE: + + if self.settings.caffevis_outputs_dir_folder_format == 'original_combined_single_image': + # maximal input image patterns is not implemented in original format + display_3D_highres = np.zeros((layer_dat_3D.shape[0], pane.data.shape[0], + pane.data.shape[1], + pane.data.shape[2]), dtype=np.uint8) + display_3D = self.downsample_display_3d(display_3D_highres, layer_dat_3D, pane, tile_cols, tile_rows) + print "ERROR: patterns view with maximal input images is not implemented when 
settings.caffevis_outputs_dir_folder_format == 'original_combined_single_image'" + + elif self.settings.caffevis_outputs_dir_folder_format == 'max_tracker_output': + display_2D, display_3D, display_3D_highres, is_layer_summary_loaded = self.load_pattern_images_optimizer_format( + default_layer_name, layer_dat_3D, n_tiles, pane, tile_cols, tile_rows, + self.state.pattern_first_only, file_search_pattern='maxim*.png') + + elif self.state.pattern_mode == PatternMode.WEIGHTS_HISTOGRAM: + display_2D, display_3D, display_3D_highres, is_layer_summary_loaded = self.load_weights_histograms( + self.net, default_layer_name, layer_dat_3D, n_tiles, pane, tile_cols, tile_rows, + show_layer_summary=self.state.cursor_area == 'top') + elif self.state.pattern_mode == PatternMode.MAX_ACTIVATIONS_HISTOGRAM: + if self.settings.caffevis_histograms_format == 'load_from_file': + display_2D, display_3D, display_3D_highres, is_layer_summary_loaded = self.load_pattern_images_optimizer_format( + default_layer_name, layer_dat_3D, n_tiles, pane, tile_cols, tile_rows, True, + file_search_pattern='max_histogram.png', + show_layer_summary=self.state.cursor_area == 'top', + file_summary_pattern='layer_inactivity.png') + + elif self.settings.caffevis_histograms_format == 'calculate_in_realtime': + display_2D, display_3D, display_3D_highres, is_layer_summary_loaded = self.load_maximal_activations_histograms( + default_layer_name, layer_dat_3D, n_tiles, pane, tile_cols, tile_rows, + show_layer_summary=self.state.cursor_area == 'top') + + elif self.state.pattern_mode == PatternMode.ACTIVATIONS_CORRELATION: + display_2D, display_3D, display_3D_highres, is_layer_summary_loaded = self.load_pattern_images_optimizer_format( + default_layer_name, layer_dat_3D, n_tiles, pane, tile_cols, tile_rows, True, + file_search_pattern=None, + show_layer_summary=True, + file_summary_pattern='channels_correlation.png') + + elif self.state.pattern_mode == PatternMode.WEIGHTS_CORRELATION: + display_2D, display_3D, 
display_3D_highres, is_layer_summary_loaded = self.load_weights_correlation( + self.net, default_layer_name, layer_dat_3D, n_tiles, pane, tile_cols, tile_rows, + show_layer_summary=True) + + else: + raise Exception("Invalid value of pattern mode: %d" % self.state.pattern_mode) else: # Show data from network (activations or diffs) @@ -424,9 +567,8 @@ def _draw_layer_pane(self, pane): layer_dat_3D_normalized = tile_images_normalize(layer_dat_3D, boost_indiv = self.state.layer_boost_indiv, boost_gamma = self.state.layer_boost_gamma) - #print ' ===layer_dat_3D_normalized.shape', layer_dat_3D_normalized.shape, 'layer_dat_3D_normalized dtype', layer_dat_3D_normalized.dtype, 'range', layer_dat_3D_normalized.min(), layer_dat_3D_normalized.max() - display_3D = layer_dat_3D_normalized + display_3D = layer_dat_3D_normalized # Convert to float if necessary: display_3D = ensure_float01(display_3D) @@ -438,54 +580,760 @@ def _draw_layer_pane(self, pane): display_3D = np.tile(display_3D, (1, 1, 1, 3)) # Upsample unit length tiles to give a more sane tile / highlight ratio # e.g. 
(1000,1,1,3) -> (1000,3,3,3) - if display_3D.shape[1] == 1: + if (display_3D.shape[1] == 1) and (display_3D.shape[2] == 1): display_3D = np.tile(display_3D, (1, 3, 3, 1)) - if self.state.layers_show_back and not self.state.pattern_mode: - padval = self.settings.caffevis_layer_clr_back_background - else: - padval = self.settings.window_background - - highlights = [None] * n_tiles - with self.state.lock: - if self.state.cursor_area == 'bottom': - highlights[self.state.selected_unit] = self.settings.caffevis_layer_clr_cursor # in [0,1] range - if self.state.backprop_selection_frozen and self.state.layer == self.state.backprop_layer: - highlights[self.state.backprop_unit] = self.settings.caffevis_layer_clr_back_sel # in [0,1] range - - _, display_2D = tile_images_make_tiles(display_3D, hw = (tile_rows,tile_cols), padval = padval, highlights = highlights) + # Upsample pair of unit length tiles to give a more sane tile / highlight ratio (occurs on siamese FC layers) + # e.g. (1000,1,2,3) -> (1000,2,2,3) + if (display_3D.shape[1] == 1) and (display_3D.shape[2] == 2): + display_3D = np.tile(display_3D, (1, 2, 1, 1)) if display_3D_highres is None: display_3D_highres = display_3D - + + + # generate 2D display by tiling the 3D images and add highlights, unless already generated + if display_2D is None: + display_2D = self.prepare_tile_image(display_3D, True, n_tiles, tile_rows, tile_cols) + + self._display_pane_based_on_zoom_mode(display_2D, display_3D_highres, is_layer_summary_loaded, pane) + + self._add_label_or_score_overlay(default_layer_name, pane) + + return display_3D_highres + + def _display_pane_based_on_zoom_mode(self, display_2D, display_3D_highres, is_layer_summary_loaded, pane): # Display pane based on layers_pane_zoom_mode state_layers_pane_zoom_mode = self.state.layers_pane_zoom_mode - assert state_layers_pane_zoom_mode in (0,1,2) + assert state_layers_pane_zoom_mode in (0, 1, 2) if state_layers_pane_zoom_mode == 0: # Mode 0: normal display (activations or 
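The two `np.tile` upsampling cases above can be checked in isolation; FC activations arrive as 1x1 (or 1x2 for siamese pairs) tiles, and tiling grows them so the selection highlight border does not dominate the tile:

```python
import numpy as np

# Single-image FC layer: (1000,1,1,3) -> (1000,3,3,3)
fc = np.random.rand(1000, 1, 1, 3)
assert np.tile(fc, (1, 3, 3, 1)).shape == (1000, 3, 3, 3)

# Siamese FC pair: tiles are 1x2, so only grow vertically: (1000,1,2,3) -> (1000,2,2,3)
pair = np.random.rand(1000, 1, 2, 3)
assert np.tile(pair, (1, 2, 1, 1)).shape == (1000, 2, 2, 3)
```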
patterns) - display_2D_resize = ensure_uint255_and_resize_to_fit(display_2D, pane.data.shape) - elif state_layers_pane_zoom_mode == 1: + if self.settings.caffevis_keep_aspect_ratio: + display_2D_resize = ensure_uint255_and_resize_to_fit(display_2D, pane.data.shape) + else: + display_2D_resize = ensure_uint255_and_resize_without_fit(display_2D, pane.data.shape) + + elif state_layers_pane_zoom_mode == 1 and not is_layer_summary_loaded: # Mode 1: zoomed selection - unit_data = display_3D_highres[self.state.selected_unit] - display_2D_resize = ensure_uint255_and_resize_to_fit(unit_data, pane.data.shape) - else: + display_2D_resize = self.get_processed_selected_unit(display_3D_highres, pane.data.shape, use_colored_data=False) + + elif state_layers_pane_zoom_mode == 2 and not is_layer_summary_loaded: # Mode 2: zoomed backprop pane - display_2D_resize = ensure_uint255_and_resize_to_fit(display_2D, pane.data.shape) * 0 + if self.settings.caffevis_keep_aspect_ratio: + display_2D_resize = ensure_uint255_and_resize_to_fit(display_2D, pane.data.shape) * 0 + else: + display_2D_resize = ensure_uint255_and_resize_without_fit(display_2D, pane.data.shape) * 0 + else: # any other case = zoom_mode + is_layer_summary_loaded + if self.settings.caffevis_keep_aspect_ratio: + display_2D_resize = ensure_uint255_and_resize_to_fit(display_2D, pane.data.shape) + else: + display_2D_resize = ensure_uint255_and_resize_without_fit(display_2D, pane.data.shape) pane.data[:] = to_255(self.settings.window_background) pane.data[0:display_2D_resize.shape[0], 0:display_2D_resize.shape[1], :] = display_2D_resize - - if self.settings.caffevis_label_layers and self.state.layer in self.settings.caffevis_label_layers and self.labels and self.state.cursor_area == 'bottom': + + def _add_label_or_score_overlay(self, default_layer_name, pane): + + if self.state.cursor_area == 'bottom': + # Display label annotation atop layers pane (e.g. 
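The `caffevis_keep_aspect_ratio` branches above choose between fitting the 2D display inside the pane (letterboxed) and stretching it to fill the pane. The fit case reduces to picking the largest scale that keeps both dimensions inside the pane; a sketch of that shape computation (illustrative, not the toolbox's `ensure_uint255_and_resize_to_fit` itself):

```python
def resize_to_fit_shape(src_hw, pane_hw):
    # Largest H,W inside pane_hw that preserves src_hw's aspect ratio.
    scale = min(float(pane_hw[0]) / src_hw[0], float(pane_hw[1]) / src_hw[1])
    return int(src_hw[0] * scale), int(src_hw[1] * scale)

# A 480x640 display in a 300x300 pane is limited by width:
assert resize_to_fit_shape((480, 640), (300, 300)) == (225, 300)
```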
for fc8/prob) - defaults = {'face': getattr(cv2, self.settings.caffevis_label_face), + defaults = {'face': getattr(cv2, self.settings.caffevis_label_face), 'fsize': self.settings.caffevis_label_fsize, - 'clr': to_255(self.settings.caffevis_label_clr), + 'clr': to_255(self.settings.caffevis_label_clr), 'thick': self.settings.caffevis_label_thick} - loc_base = self.settings.caffevis_label_loc[::-1] # Reverse to OpenCV c,r order - lines = [FormattedString(self.labels[self.state.selected_unit], defaults)] + loc_base = self.settings.caffevis_label_loc[::-1] # Reverse to OpenCV c,r order + + text_to_display = "" + if (self.labels) and (default_layer_name in self.settings.caffevis_label_layers): + text_to_display = self.labels[self.state.selected_unit] + " " + + if self.state.show_maximal_score: + if self.state.siamese_view_mode_has_two_images(): + if self.state.layers_show_back: + blob1, blob2 = self.state.get_siamese_selected_diff_blobs(self.net) + + if len(blob1.shape) == 1: + value1, value2 = blob1[self.state.selected_unit], blob2[self.state.selected_unit] + text_to_display += 'grad: ' + str(value1) + " " + str(value2) + else: + blob1, blob2 = self.state.get_siamese_selected_data_blobs(self.net) + + if len(blob1.shape) == 1: + value1, value2 = blob1[self.state.selected_unit], blob2[self.state.selected_unit] + text_to_display += 'act: ' + str(value1) + " " + str(value2) + + else: + if self.state.layers_show_back: + blob = self.state.get_single_selected_diff_blob(self.net) + + if len(blob.shape) == 1: + value = blob[self.state.selected_unit] + text_to_display += 'grad: ' + str(value) + else: + blob = self.state.get_single_selected_data_blob(self.net) + + if len(blob.shape) == 1: + value = blob[self.state.selected_unit] + text_to_display += 'act: ' + str(value) + + lines = [FormattedString(text_to_display, defaults)] cv2_typeset_text(pane.data, lines, loc_base) - + + def load_pattern_images_original_format(self, default_layer_name, layer_dat_3D, n_tiles, pane, + 
tile_cols, tile_rows): + display_2D = None + display_3D_highres = None + is_layer_summary_loaded = False + load_layer = default_layer_name + if self.settings.caffevis_jpgvis_remap and load_layer in self.settings.caffevis_jpgvis_remap: + load_layer = self.settings.caffevis_jpgvis_remap[load_layer] + if ((self.settings.caffevis_jpgvis_layers and load_layer in self.settings.caffevis_jpgvis_layers) or (self.settings.caffevis_jpgvis_layers is None)) and self.settings.caffevis_outputs_dir: + jpg_path = os.path.join(self.settings.caffevis_outputs_dir, 'regularized_opt', load_layer, 'whole_layer.jpg') + + # Get highres version + display_3D_highres = self.img_cache.get((jpg_path, 'whole'), None) + + if display_3D_highres is None: + try: + with WithTimer('CaffeVisApp:load_sprite_image', quiet=self.debug_level < 1): + display_3D_highres = load_square_sprite_image(jpg_path, n_sprites=n_tiles) + except IOError: + # File does not exist, so just display disabled. + pass + else: + if display_3D_highres is not None: + self.img_cache.set((jpg_path, 'whole'), display_3D_highres) + + display_3D = self.downsample_display_3d(display_3D_highres, layer_dat_3D, pane, tile_cols, tile_rows) + return display_2D, display_3D, display_3D_highres, is_layer_summary_loaded + + def load_pattern_images_optimizer_format(self, default_layer_name, layer_dat_3D, n_tiles, pane, + tile_cols, tile_rows, first_only, file_search_pattern, show_layer_summary = False, file_summary_pattern = ""): + is_layer_summary_loaded = False + display_2D = None + display_3D_highres = None + load_layer = default_layer_name + if self.settings.caffevis_jpgvis_remap and load_layer in self.settings.caffevis_jpgvis_remap: + load_layer = self.settings.caffevis_jpgvis_remap[load_layer] + if (self.settings.caffevis_jpgvis_layers and load_layer in self.settings.caffevis_jpgvis_layers) or (self.settings.caffevis_jpgvis_layers is None): + + # get number of units + units_num = layer_dat_3D.shape[0] + + pattern_image_key = 
(self.settings.caffevis_outputs_dir, load_layer, "unit_%04d", units_num, file_search_pattern, first_only, show_layer_summary, file_summary_pattern) + + # Get highres version + display_3D_highres = self.img_cache.get(pattern_image_key, None) + + if display_3D_highres is None: + try: + + if self.settings.caffevis_outputs_dir: + resize_shape = pane.data.shape + + if show_layer_summary: + # load layer summary image + layer_summary_image_path = os.path.join(self.settings.caffevis_outputs_dir, load_layer, file_summary_pattern) + layer_summary_image = caffe_load_image(layer_summary_image_path, color=True, as_uint=True) + layer_summary_image = ensure_uint255_and_resize_without_fit(layer_summary_image, resize_shape) + display_3D_highres = layer_summary_image + display_3D_highres = np.expand_dims(display_3D_highres, 0) + display_2D = display_3D_highres[0] + is_layer_summary_loaded = True + + else: + if file_search_pattern is None: + display_3D_highres = None + else: + with WithTimer('CaffeVisApp:load_image_per_unit', quiet=self.debug_level < 1): + # load all images + display_3D_highres = self.load_image_per_unit(display_3D_highres, load_layer, units_num, first_only, resize_shape, file_search_pattern) + + except IOError: + # File does not exist, so just display disabled. 
+ pass + else: + if display_3D_highres is not None: + self.img_cache.set(pattern_image_key, display_3D_highres) + else: + + # if layer found in cache, mark it as loaded + if show_layer_summary: + display_2D = display_3D_highres[0] + is_layer_summary_loaded = True + + display_3D = self.downsample_display_3d(display_3D_highres, layer_dat_3D, pane, tile_cols, tile_rows) + return display_2D, display_3D, display_3D_highres, is_layer_summary_loaded + + def load_image_per_unit(self, display_3D_highres, load_layer, units_num, first_only, resize_shape, file_search_pattern): + + # limit loading + if units_num > 1000: + print "WARNING: load_image_per_unit was asked to load %d units, aborted to avoid hang" % (units_num) + return None + + # for each neuron in layer + for unit_id in range(0, units_num): + + unit_folder_path = os.path.join(self.settings.caffevis_outputs_dir, load_layer, "unit_%04d" % (unit_id), file_search_pattern) + + try: + if unit_id % 10 == 0: + print "loading %s images for layer %s channel %d out of %d" % (file_search_pattern, load_layer, unit_id, units_num) + + unit_first_image = get_image_from_files(self.settings, unit_folder_path, False, resize_shape, first_only) + + # handle first generation of results container + if display_3D_highres is None: + unit_first_image_shape = unit_first_image.shape + display_3D_highres = np.zeros((units_num, unit_first_image_shape[0], + unit_first_image_shape[1], + unit_first_image_shape[2]), dtype=np.uint8) + + # set in result + display_3D_highres[unit_id, :, ::] = unit_first_image + + except: + print '\nAttempted to load file from %s but failed. To suppress this warning, remove layer "%s" from settings.caffevis_jpgvis_layers' % \ + (unit_folder_path, load_layer) + pass return display_3D_highres + def downsample_display_3d(self, display_3D_highres, layer_dat_3D, pane, tile_cols, tile_rows): + if display_3D_highres is not None: + # Get lowres version, maybe. Assume we want at least one pixel for selection border. 
+ row_downsamp_factor = int( + np.ceil(float(display_3D_highres.shape[1]) / (pane.data.shape[0] / tile_rows - 2))) + col_downsamp_factor = int( + np.ceil(float(display_3D_highres.shape[2]) / (pane.data.shape[1] / tile_cols - 2))) + ds = max(row_downsamp_factor, col_downsamp_factor) + if ds > 1: + # print 'Downsampling by', ds + display_3D = display_3D_highres[:, ::ds, ::ds, :] + else: + display_3D = display_3D_highres + else: + display_3D = layer_dat_3D * 0 # nothing to show + return display_3D + + + def load_weights_histograms(self, net, layer_name, layer_dat_3D, n_channels, pane, tile_cols, tile_rows, show_layer_summary): + + is_layer_summary_loaded = False + display_2D = None + display_3D = None + empty_display_3D = np.zeros(layer_dat_3D.shape + (3,)) + + pattern_image_key_3d = (layer_name, "weights_histogram", show_layer_summary, self.state.selected_unit, "3D") + pattern_image_key_2d = (layer_name, "weights_histogram", show_layer_summary, self.state.selected_unit, "2D") + + # Get highres version + display_3D_highres = self.img_cache.get(pattern_image_key_3d, None) + display_2D = self.img_cache.get(pattern_image_key_2d, None) + + if display_3D_highres is None or display_2D is None: + + pane_shape = pane.data.shape + + if not self.settings.caffevis_outputs_dir: + folder_path = None + cache_layer_weights_histogram_image_path = None + cache_details_weights_histogram_image_path = None + else: + folder_path = os.path.join(self.settings.caffevis_outputs_dir, layer_name) + cache_layer_weights_histogram_image_path = os.path.join(folder_path, 'layer_weights_histogram.png') + cache_details_weights_histogram_image_path = os.path.join(folder_path, 'details_weights_histogram.png') + + # plotting objects needed for + # 1. calculating size of results array + # 2. generating weights histogram for selected unit + # 3. 
generating weights histograms for all units + + import matplotlib.pyplot as plt + + fig = plt.figure(figsize=(10, 10), facecolor='white', tight_layout=False) + ax = fig.add_subplot(111) + + def calculate_weights_histogram_for_specific_unit(channel_idx, fig, ax, do_print): + + if do_print and channel_idx % 10 == 0: + print "calculating weights histogram for layer %s channel %d out of %d" % (layer_name, channel_idx, n_channels) + + # get vector of weights + weights = net.params[layer_name][0].data[channel_idx].flatten() + bias = net.params[layer_name][1].data[channel_idx] + + # create histogram + hist, bin_edges = np.histogram(weights, bins=50) + + # generate histogram image file + width = 0.7 * (bin_edges[1] - bin_edges[0]) + center = (bin_edges[:-1] + bin_edges[1:]) / 2 + + ax.bar(center, hist, align='center', width=width, color='g') + + fig.suptitle('weights for unit %d, bias is %f' % (channel_idx, bias)) + ax.xaxis.label.set_text('weight value') + ax.yaxis.label.set_text('count') + + figure_buffer = fig2data(fig) + + display_3D_highres[channel_idx, :, ::] = figure_buffer + + ax.cla() + + + try: + + # handle generation of results container + figure_buffer = fig2data(fig) + first_shape = figure_buffer.shape + display_3D_highres = np.zeros((n_channels, first_shape[0], first_shape[1], first_shape[2]), dtype=np.uint8) + + # try load from cache + if show_layer_summary: + + # try load cache file for layer weight histogram + if cache_layer_weights_histogram_image_path and os.path.exists(cache_layer_weights_histogram_image_path): + + # load 2d image from cache file + display_2D = caffe_load_image(cache_layer_weights_histogram_image_path, color=True, as_uint=False) + display_3D_highres = np.zeros(pane_shape) + display_3D_highres = np.expand_dims(display_3D_highres, 0) + display_3D_highres[0] = display_2D + + is_layer_summary_loaded = True + + else: + + # try load cache file for details weights histogram + if cache_details_weights_histogram_image_path and 
os.path.exists(cache_details_weights_histogram_image_path): + + # load 2d image from cache file + display_2D = caffe_load_image(cache_details_weights_histogram_image_path, color=True, as_uint=False) + + # calculate weights histogram for selected unit + calculate_weights_histogram_for_specific_unit(self.state.selected_unit, fig, ax, do_print=False) + + display_3D = self.downsample_display_3d(display_3D_highres, layer_dat_3D, pane, tile_cols, tile_rows) + + # generate empty highlights + display_2D_highlights_only = self.prepare_tile_image(display_3D * 0, True, n_channels, tile_rows, tile_cols) + + # if shapes are not equal, cache is invalid + if display_2D_highlights_only.shape == display_2D.shape: + # mix highlights with cached image + display_2D = (display_2D_highlights_only != 0) * display_2D_highlights_only + (display_2D_highlights_only == 0) * display_2D + else: + display_2D = None + + # if not loaded from cache, generate the data + if display_2D is None: + + # calculate weights histogram image + + # check if layer has weights at all + if not net.params.has_key(layer_name): + return display_2D, empty_display_3D, empty_display_3D, is_layer_summary_loaded + + # pattern_image_key_layer = (layer_name, "weights_histogram", True) + # pattern_image_key_details = (layer_name, "weights_histogram", False) + + # self.img_cache.set(pattern_image_key_details, display_3D_highres) + # self.img_cache.set(pattern_image_key_layer, display_3D_highres_summary) + + if show_layer_summary: + + half_pane_shape = (pane_shape[0], pane_shape[1] / 2) + + # generate weights histogram for layer + weights = net.params[layer_name][0].data.flatten() + hist, bin_edges = np.histogram(weights, bins=50) + + width = 0.7 * (bin_edges[1] - bin_edges[0]) + center = (bin_edges[:-1] + bin_edges[1:]) / 2 + ax.bar(center, hist, align='center', width=width, color='g') + + fig.suptitle('weights for layer %s' % layer_name) + ax.xaxis.label.set_text('weight value') + ax.yaxis.label.set_text('count') + + 
figure_buffer = fig2data(fig) + display_3D_highres_summary_weights = ensure_uint255_and_resize_without_fit(figure_buffer, half_pane_shape) + + ax.cla() + + # generate bias histogram for layer + bias = net.params[layer_name][1].data.flatten() + hist, bin_edges = np.histogram(bias, bins=50) + + width = 0.7 * (bin_edges[1] - bin_edges[0]) + center = (bin_edges[:-1] + bin_edges[1:]) / 2 + ax.bar(center, hist, align='center', width=width, color='g') + + fig.suptitle('bias for layer %s' % layer_name) + ax.xaxis.label.set_text('bias value') + ax.yaxis.label.set_text('count') + + figure_buffer = fig2data(fig) + display_3D_highres_summary_bias = ensure_uint255_and_resize_without_fit(figure_buffer, half_pane_shape) + + display_3D_highres_summary = np.concatenate((display_3D_highres_summary_weights, display_3D_highres_summary_bias), axis=1) + display_3D_highres_summary = np.expand_dims(display_3D_highres_summary, 0) + + display_3D_highres = display_3D_highres_summary + display_2D = display_3D_highres[0] + is_layer_summary_loaded = True + + if folder_path: + mkdir_p(folder_path) + save_caffe_image(display_2D[:,:,::-1].astype(np.float32).transpose((2,0,1)), cache_layer_weights_histogram_image_path) + else: + print "WARNING: unable to save weight histogram to cache since caffevis_outputs_dir is not set" + + else: + + # for each channel + for channel_idx in xrange(n_channels): + calculate_weights_histogram_for_specific_unit(channel_idx, fig, ax, do_print=True) + + display_3D = self.downsample_display_3d(display_3D_highres, layer_dat_3D, pane, tile_cols, tile_rows) + + # generate display of details weights histogram image + display_2D = self.prepare_tile_image(display_3D, False, n_channels, tile_rows, tile_cols) + + if folder_path: + # save histogram image to cache + mkdir_p(folder_path) + save_caffe_image(display_2D[:,:,::-1].astype(np.float32).transpose((2,0,1)), cache_details_weights_histogram_image_path) + else: + print "WARNING: unable to save weight histogram to cache since 
caffevis_outputs_dir is not set" + + # generate empty highlights + display_2D_highlights_only = self.prepare_tile_image(display_3D * 0, True, n_channels, tile_rows, tile_cols) + + # mix highlights with cached image + display_2D = (display_2D_highlights_only != 0) * display_2D_highlights_only + (display_2D_highlights_only == 0) * display_2D + + except IOError: + return display_2D, empty_display_3D, empty_display_3D, is_layer_summary_loaded + # File does not exist, so just display disabled. + pass + + else: + self.img_cache.set(pattern_image_key_3d, display_3D_highres) + self.img_cache.set(pattern_image_key_2d, display_2D) + + fig.clf() + plt.close(fig) + + else: + # here we can safely assume that display_2D is not None, so we only need to check if show_layer_summary was requested + if show_layer_summary: + is_layer_summary_loaded = True + + pass + + if display_3D is None: + display_3D = self.downsample_display_3d(display_3D_highres, layer_dat_3D, pane, tile_cols, tile_rows) + + return display_2D, display_3D, display_3D_highres, is_layer_summary_loaded + + def load_weights_correlation(self, net, layer_name, layer_dat_3D, n_channels, pane, tile_cols, tile_rows, show_layer_summary): + + is_layer_summary_loaded = False + display_2D = None + display_3D = None + empty_display_3D = np.zeros(layer_dat_3D.shape + (3,)) + + pattern_image_key_3d = (layer_name, "weights_correlation", show_layer_summary, self.state.selected_unit, "3D") + pattern_image_key_2d = (layer_name, "weights_correlation", show_layer_summary, self.state.selected_unit, "2D") + + # Get highres version + display_3D_highres = self.img_cache.get(pattern_image_key_3d, None) + display_2D = self.img_cache.get(pattern_image_key_2d, None) + + if display_3D_highres is None or display_2D is None: + + pane_shape = pane.data.shape + + if not self.settings.caffevis_outputs_dir: + folder_path = None + cache_layer_weights_correlation_image_path = None + else: + folder_path = os.path.join(self.settings.caffevis_outputs_dir, 
layer_name) + cache_layer_weights_correlation_image_path = os.path.join(folder_path, 'layer_weights_correlation.png') + + try: + + # try load cache file for layer weight correlation + if cache_layer_weights_correlation_image_path and os.path.exists(cache_layer_weights_correlation_image_path): + + # load 2d image from cache file + display_2D = caffe_load_image(cache_layer_weights_correlation_image_path, color=True, as_uint=False) + display_3D_highres = np.zeros(pane_shape) + display_3D_highres = np.expand_dims(display_3D_highres, 0) + display_3D_highres[0] = display_2D + + is_layer_summary_loaded = True + + # if not loaded from cache, generate the data + if display_2D is None: + + # calculate weights correlation image + + # check if layer has weights at all + if not net.params.has_key(layer_name): + return display_2D, empty_display_3D, empty_display_3D, is_layer_summary_loaded + + # skip layers with only one channel + if n_channels == 1: + return display_2D, empty_display_3D, empty_display_3D, is_layer_summary_loaded + + data_unroll = net.params[layer_name][0].data.reshape((n_channels, -1)) # Note: no copy eg (96,3025). 
Does nothing if not is_spatial + + corr = np.corrcoef(data_unroll) + + # fix possible NaNs + corr = np.nan_to_num(corr) + np.fill_diagonal(corr, 1) + + # sort correlation matrix + indexes = np.lexsort(corr) + sorted_corr = corr[indexes, :][:, indexes] + + # plot correlation matrix + import matplotlib.pyplot as plt + + fig = plt.figure(figsize=(10, 10), facecolor='white', tight_layout=True) + plt.subplot(1, 1, 1) + plt.imshow(sorted_corr, interpolation='nearest', vmin=-1, vmax=1) + plt.colorbar() + plt.title('channel weights correlation matrix for layer %s' % (layer_name)) + figure_buffer = fig2data(fig) + plt.close() + + display_3D_highres_summary = ensure_uint255_and_resize_without_fit(figure_buffer, pane_shape) + display_3D_highres_summary = np.expand_dims(display_3D_highres_summary, 0) + display_3D_highres = display_3D_highres_summary + display_2D = display_3D_highres[0] + is_layer_summary_loaded = True + + if folder_path: + mkdir_p(folder_path) + save_caffe_image(display_2D[:,:,::-1].astype(np.float32).transpose((2,0,1)), cache_layer_weights_correlation_image_path) + else: + print "WARNING: unable to save weight correlation to cache since caffevis_outputs_dir is not set" + + self.img_cache.set(pattern_image_key_3d, display_3D_highres) + self.img_cache.set(pattern_image_key_2d, display_2D) + + except IOError: + return display_2D, empty_display_3D, empty_display_3D, is_layer_summary_loaded + # File does not exist, so just display disabled. 
+ pass + + else: + # here we can safely assume that display_2D is not None, so we only need to check if show_layer_summary was requested + if show_layer_summary: + is_layer_summary_loaded = True + + pass + + if display_3D is None: + display_3D = self.downsample_display_3d(display_3D_highres, layer_dat_3D, pane, tile_cols, tile_rows) + + return display_2D, display_3D, display_3D_highres, is_layer_summary_loaded + + def load_maximal_activations_histograms(self, default_layer_name, layer_dat_3D, n_tiles, pane, tile_cols, tile_rows, show_layer_summary): + + display_2D = None + empty_display_3D = np.zeros(layer_dat_3D.shape + (3,)) + + is_layer_summary_loaded = False + + maximum_activation_histogram_data_file = os.path.join(self.settings.caffevis_outputs_dir, 'find_max_acts_output.pickled') + pattern_image_key = (maximum_activation_histogram_data_file, default_layer_name, "max histograms", show_layer_summary) + + # Get highres version + display_3D_highres = self.img_cache.get(pattern_image_key, None) + + pane_shape = pane.data.shape + + if display_3D_highres is None: + try: + # load pickle file + net_max_tracker = load_max_tracker_from_file(maximum_activation_histogram_data_file) + + if not net_max_tracker.max_trackers.has_key(default_layer_name): + return display_2D, empty_display_3D, empty_display_3D, is_layer_summary_loaded + + # check if the histogram data was saved in the pickle file + if not hasattr(net_max_tracker.max_trackers[default_layer_name], 'channel_to_histogram'): + print "ERROR: file %s is missing the field channel_to_histogram, try rerunning find_max_acts to generate it" % (maximum_activation_histogram_data_file) + return display_2D, empty_display_3D, empty_display_3D, is_layer_summary_loaded + + channel_to_histogram = net_max_tracker.max_trackers[default_layer_name].channel_to_histogram + + def channel_to_histogram_values(channel_idx): + + # get channel data + hist, bin_edges = channel_to_histogram[channel_idx] + + return hist, bin_edges + + display_3D_highres_list = [display_3D_highres, display_3D_highres] 
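The sorted correlation matrix built in `load_weights_correlation` above can be reproduced in isolation. Below is a minimal sketch assuming only NumPy; the helper name `sorted_weight_correlation` is illustrative and not part of the toolbox:

```python
import numpy as np

def sorted_weight_correlation(weights):
    # Unroll each channel's filter into a row vector: (channels, everything_else),
    # mirroring net.params[layer_name][0].data.reshape((n_channels, -1)).
    n_channels = weights.shape[0]
    data_unroll = weights.reshape((n_channels, -1))

    # Correlation between channel weight vectors; a constant channel yields a
    # NaN row, so patch NaNs and restore the unit diagonal.
    corr = np.corrcoef(data_unroll)
    corr = np.nan_to_num(corr)
    np.fill_diagonal(corr, 1)

    # Reorder rows and columns together so similar channels cluster visually.
    indexes = np.lexsort(corr)
    return corr[indexes, :][:, indexes]

# Toy layer: 4 channels of 3x3x3 filters.
rng = np.random.RandomState(0)
sorted_corr = sorted_weight_correlation(rng.randn(4, 3, 3, 3))
```

Applying the same permutation to rows and columns keeps the matrix symmetric with a unit diagonal, which is what the `imshow(..., vmin=-1, vmax=1)` plot in the tool relies on.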
+ + def process_channel_figure(channel_idx, fig): + figure_buffer = fig2data(fig) + + # handle first generation of results container + if display_3D_highres_list[0] is None: + first_shape = figure_buffer.shape + display_3D_highres_list[0] = np.zeros((len(channel_to_histogram), first_shape[0], + first_shape[1], + first_shape[2]), dtype=np.uint8) + + display_3D_highres_list[0][channel_idx, :, ::] = figure_buffer + pass + + def process_layer_figure(fig): + figure_buffer = fig2data(fig) + display_3D_highres_list[1] = ensure_uint255_and_resize_without_fit(figure_buffer, pane_shape) + display_3D_highres_list[1] = np.expand_dims(display_3D_highres_list[1], 0) + pass + + n_channels = len(channel_to_histogram) + find_maxes.max_tracker.prepare_max_histogram(default_layer_name, n_channels, channel_to_histogram_values, process_channel_figure, process_layer_figure) + + pattern_image_key_layer = (maximum_activation_histogram_data_file, default_layer_name, "max histograms",True) + pattern_image_key_details = (maximum_activation_histogram_data_file, default_layer_name, "max histograms",False) + + self.img_cache.set(pattern_image_key_details, display_3D_highres_list[0]) + self.img_cache.set(pattern_image_key_layer, display_3D_highres_list[1]) + + if show_layer_summary: + display_3D_highres = display_3D_highres_list[1] + display_2D = display_3D_highres[0] + is_layer_summary_loaded = True + else: + display_3D_highres = display_3D_highres_list[0] + + except IOError: + return display_2D, empty_display_3D, empty_display_3D, is_layer_summary_loaded + # File does not exist, so just display disabled. 
+ pass + else: + + # if layer found in cache, mark it as loaded + if show_layer_summary: + display_2D = display_3D_highres[0] + is_layer_summary_loaded = True + + display_3D = self.downsample_display_3d(display_3D_highres, layer_dat_3D, pane, tile_cols, tile_rows) + return display_2D, display_3D, display_3D_highres, is_layer_summary_loaded + + def get_processed_selected_unit(self, all_units, resize_shape, use_colored_data = False): + + unit_data = all_units[self.state.selected_unit] + if self.settings.caffevis_keep_aspect_ratio: + unit_data_resize = resize_to_fit(unit_data, resize_shape) + else: + unit_data_resize = resize_without_fit(unit_data, resize_shape) + + if self.state.pattern_mode == PatternMode.OFF: + if self.state.last_frame is None: + pass + + input_image = SiameseHelper.get_image_from_frame(self.state.last_frame, self.state.settings.is_siamese, + resize_shape, self.state.siamese_view_mode) + normalized_mask = unit_data_resize + + if use_colored_data: + unit_data_resize = self.state.gray_to_colormap(unit_data_resize) + normalized_mask = np.tile(normalized_mask[:, :, np.newaxis], 3) + + if self.state.input_overlay_option == InputOverlayOption.OFF: + pass + + elif self.state.input_overlay_option == InputOverlayOption.OVER_ACTIVE: + + unit_data_resize = normalized_mask * input_image + (1 - normalized_mask) * unit_data_resize + + elif self.state.input_overlay_option == InputOverlayOption.OVER_INACTIVE: + unit_data_resize = (normalized_mask < 0.1) * input_image + (normalized_mask >= 0.1) * unit_data_resize + pass + + unit_data_resize = ensure_uint255(unit_data_resize) + return unit_data_resize + + + def _mix_input_overlay_with_colormap_old(self, unit_data, resize_shape, input_image): + + if self.settings.caffevis_keep_aspect_ratio: + unit_data_resize = ensure_uint255_and_resize_to_fit(unit_data, resize_shape) + input_image_resize = ensure_uint255_and_resize_to_fit(input_image, resize_shape) + else: + unit_data_resize = 
ensure_uint255_and_resize_without_fit(unit_data, resize_shape) + input_image_resize = ensure_uint255_and_resize_without_fit(input_image, resize_shape) + + normalized_mask = unit_data_resize / 255.0 + normalized_mask = np.tile(normalized_mask[:, :, np.newaxis], 3) + + colored_unit_data_resize = self.state.gray_to_colormap(unit_data_resize) + colored_unit_data_resize = ensure_uint255(colored_unit_data_resize) + if len(colored_unit_data_resize.shape) == 2: + colored_unit_data_resize = np.tile(colored_unit_data_resize[:, :, np.newaxis], 3) + + if self.state.input_overlay_option == InputOverlayOption.OFF: + pass + + elif self.state.input_overlay_option == InputOverlayOption.OVER_ACTIVE: + colored_unit_data_resize = np.array(normalized_mask * input_image_resize + (1 - normalized_mask) * colored_unit_data_resize, dtype = 'uint8') + + elif self.state.input_overlay_option == InputOverlayOption.OVER_INACTIVE: + MAGIC_THRESHOLD_NUMBER = 0.3 + colored_unit_data_resize = (normalized_mask < MAGIC_THRESHOLD_NUMBER) * input_image_resize + (normalized_mask >= MAGIC_THRESHOLD_NUMBER) * colored_unit_data_resize + pass + + return colored_unit_data_resize + + def _mix_input_overlay_with_colormap(self, unit_data, resize_shape, input_image): + + # resize + if self.settings.caffevis_keep_aspect_ratio: + input_image_resize = resize_to_fit(input_image, resize_shape) + unit_data_resize = resize_to_fit(unit_data, resize_shape) + else: + input_image_resize = resize_without_fit(input_image, resize_shape) + unit_data_resize = resize_without_fit(unit_data, resize_shape) + + sigma = 0.02 * max(unit_data_resize.shape[0:2]) + blur_unit_data_resize = cv2.GaussianBlur(unit_data_resize, (0, 0), sigma) + normalized_blur_unit_data_resize = norm01(blur_unit_data_resize) + + colored_normalized_blur_unit_data_resize = self.state.gray_to_colormap(normalized_blur_unit_data_resize) + if len(colored_normalized_blur_unit_data_resize.shape) == 2: + colored_normalized_blur_unit_data_resize = 
np.tile(colored_normalized_blur_unit_data_resize[:, :, np.newaxis], 3) + + if self.state.input_overlay_option == InputOverlayOption.OFF: + attMap = colored_normalized_blur_unit_data_resize + pass + + elif self.state.input_overlay_option == InputOverlayOption.OVER_ACTIVE: + MAGIC_NUMBER = 0.8 + boost_normalized_blur_unit_data_resize = normalized_blur_unit_data_resize ** MAGIC_NUMBER + boost_normalized_blur_unit_data_resize = boost_normalized_blur_unit_data_resize.reshape(boost_normalized_blur_unit_data_resize.shape + (1,)) + attMap = (boost_normalized_blur_unit_data_resize) * input_image_resize + (1 - boost_normalized_blur_unit_data_resize) * colored_normalized_blur_unit_data_resize + + elif self.state.input_overlay_option == InputOverlayOption.OVER_INACTIVE: + MAGIC_NUMBER = 0.8 + boost_normalized_blur_unit_data_resize = normalized_blur_unit_data_resize ** MAGIC_NUMBER + boost_normalized_blur_unit_data_resize = boost_normalized_blur_unit_data_resize.reshape(boost_normalized_blur_unit_data_resize.shape + (1,)) + attMap = (1 - boost_normalized_blur_unit_data_resize) * input_image_resize + (boost_normalized_blur_unit_data_resize) * colored_normalized_blur_unit_data_resize + + return attMap + + def _draw_aux_pane(self, pane, layer_data_normalized): pane.data[:] = to_255(self.settings.window_background) @@ -497,9 +1345,9 @@ def _draw_aux_pane(self, pane, layer_data_normalized): mode = 'prob_labels' if mode == 'selected': - unit_data = layer_data_normalized[self.state.selected_unit] - unit_data_resize = ensure_uint255_and_resize_to_fit(unit_data, pane.data.shape) + unit_data_resize = self.get_processed_selected_unit(layer_data_normalized, pane.data.shape, use_colored_data=False) pane.data[0:unit_data_resize.shape[0], 0:unit_data_resize.shape[1], :] = unit_data_resize + elif mode == 'prob_labels': self._draw_prob_labels_pane(pane) @@ -508,9 +1356,7 @@ def _draw_back_pane(self, pane): with self.state.lock: back_enabled = self.state.back_enabled back_mode = 
self.state.back_mode - back_filt_mode = self.state.back_filt_mode - state_layer = self.state.layer - selected_unit = self.state.selected_unit + back_view_option = self.state.back_view_option back_what_to_disp = self.get_back_what_to_disp() if back_what_to_disp == 'disabled': @@ -519,77 +1365,166 @@ def _draw_back_pane(self, pane): elif back_what_to_disp == 'stale': pane.data[:] = to_255(self.settings.stale_background) - else: - # One of the backprop modes is enabled and the back computation (gradient or deconv) is up to date - - grad_blob = self.net.blobs['data'].diff - - # Manually deprocess (skip mean subtraction and rescaling) - #grad_img = self.net.deprocess('data', diff_blob) - grad_blob = grad_blob[0] # bc01 -> c01 - grad_blob = grad_blob.transpose((1,2,0)) # c01 -> 01c - if self._net_channel_swap_inv is None: - grad_img = grad_blob[:, :, :] # e.g. BGR -> RGB - else: - grad_img = grad_blob[:, :, self._net_channel_swap_inv] # e.g. BGR -> RGB - - # Mode-specific processing - assert back_mode in ('grad', 'deconv') - assert back_filt_mode in ('raw', 'gray', 'norm', 'normblur') - if back_filt_mode == 'raw': - grad_img = norm01c(grad_img, 0) - elif back_filt_mode == 'gray': - grad_img = grad_img.mean(axis=2) - grad_img = norm01c(grad_img, 0) - elif back_filt_mode == 'norm': - grad_img = np.linalg.norm(grad_img, axis=2) - grad_img = norm01(grad_img) - else: - grad_img = np.linalg.norm(grad_img, axis=2) - cv2.GaussianBlur(grad_img, (0,0), self.settings.caffevis_grad_norm_blur_radius, grad_img) - grad_img = norm01(grad_img) + else: # One of the backprop modes is enabled and the back computation (gradient or deconv) is up to date - # If necessary, re-promote from grayscale to color - if len(grad_img.shape) == 2: - grad_img = np.tile(grad_img[:,:,np.newaxis], 3) + # define helper function to run processing once or twice, in case of siamese network + def run_processing_once_or_twice(resize_shape, process_image_fn): + + has_pair_inputs = False + no_spatial_info = False; + 
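The blob deprocessing that `_draw_back_pane` applies before every view mode (drop the batch axis, `c01 -> 01c` transpose, optional inverse channel swap) can be sketched standalone. `deprocess_grad_blob` is an illustrative name; only NumPy is assumed:

```python
import numpy as np

def deprocess_grad_blob(grad_blob, channel_swap_inv=None):
    # bc01 -> c01: drop the batch axis (the tool visualizes a single image).
    grad = grad_blob[0]
    # c01 -> 01c: move channels last so the array is a displayable image.
    grad = grad.transpose((1, 2, 0))
    if channel_swap_inv is not None:
        # Undo the network's channel swap, e.g. BGR -> RGB.
        grad = grad[:, :, channel_swap_inv]
    return grad

# A 2-image batch of 3-channel 4x5 gradients; only image 0 is displayed.
blob = np.arange(2 * 3 * 4 * 5, dtype=np.float32).reshape(2, 3, 4, 5)
img = deprocess_grad_blob(blob, channel_swap_inv=(2, 1, 0))
```

After this step the array is ready for the per-mode processing (`norm01c`, channel mean, or the blurred-norm colormap overlay).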
+ # if selection is frozen we use the currently selected layer as target for visualization + if self.state.backprop_selection_frozen: + if self.state.siamese_view_mode_has_two_images(): + grad_blob1, grad_blob2 = self.state.get_siamese_selected_diff_blobs(self.net) + + if len(grad_blob1.shape) == 1: + no_spatial_info = True + + if len(grad_blob1.shape) == 3: + grad_blob1 = grad_blob1.transpose((1, 2, 0)) # c01 -> 01c + grad_blob2 = grad_blob2.transpose((1, 2, 0)) # c01 -> 01c + + has_pair_inputs = True + + else: + grad_blob = self.state.get_single_selected_diff_blob(self.net) + if len(grad_blob.shape) == 1: + no_spatial_info = True + if len(grad_blob.shape) == 3: + grad_blob = grad_blob.transpose((1, 2, 0)) # c01 -> 01c + + # if selection is not frozen we use the input layer as target for visualization + if (not self.state.backprop_selection_frozen) or no_spatial_info: + grad_blob = self.net.blobs['data'].diff + + grad_blob = grad_blob[0] # bc01 -> c01 + grad_blob = grad_blob.transpose((1, 2, 0)) # c01 -> 01c + + if self._net_channel_swap_inv: + grad_blob = grad_blob[:, :, self._net_channel_swap_inv] # e.g. 
BGR -> RGB + + # split image to image0 and image1 + if self.settings.is_siamese: + # split image to image0 and image1 + if self.settings.siamese_input_mode == 'concat_channelwise': + [grad_blob1, grad_blob2] = np.split(grad_blob, 2, axis=2) + + elif self.settings.siamese_input_mode == 'concat_along_width': + half_width = grad_blob.shape[1] / 2 + grad_blob1 = grad_blob[:, :half_width, :] + grad_blob2 = grad_blob[:, half_width:, :] + + has_pair_inputs = True + + # if siamese network, run processing twice + if self.settings.is_siamese: - if (self.settings.static_files_input_mode == "siamese_image_list") and (grad_img.shape[2] == 6): + # combine image0 and image1 + if self.state.siamese_view_mode == SiameseViewMode.FIRST_IMAGE: + # run processing on image0 + return process_image_fn(grad_blob, resize_shape, self.state.last_frame[0]) - grad_img1 = grad_img[:, :, 0:3] - grad_img2 = grad_img[:, :, 3:6] + elif self.state.siamese_view_mode == SiameseViewMode.SECOND_IMAGE: + # run processing on image1 + return process_image_fn(grad_blob, resize_shape, self.state.last_frame[1]) - half_pane_shape = (pane.data.shape[0] / 2, pane.data.shape[1]) - grad_img_disp1 = cv2.resize(grad_img1[:], half_pane_shape) - grad_img_disp2 = cv2.resize(grad_img2[:], half_pane_shape) + elif self.state.siamese_view_mode == SiameseViewMode.BOTH_IMAGES and has_pair_inputs: - grad_img_disp = np.concatenate((grad_img_disp1, grad_img_disp2), axis=1) + # resize each gradient image to half the pane size + half_pane_shape = (resize_shape[0], resize_shape[1] / 2) + # run processing on both image0 and image1 + image1 = process_image_fn(grad_blob1, half_pane_shape, self.state.last_frame[0]) + image2 = process_image_fn(grad_blob2, half_pane_shape, self.state.last_frame[1]) + + image1 = resize_without_fit(image1[:], half_pane_shape) + image2 = resize_without_fit(image2[:], half_pane_shape) + + # generate the pane image by concatenating both images + return np.concatenate((image1, image2), axis=1) + elif 
self.state.siamese_view_mode == SiameseViewMode.BOTH_IMAGES and not has_pair_inputs: + processed_input = self.state.convert_image_pair_to_network_input_format(self.settings, self.state.last_frame, resize_shape) + return process_image_fn(grad_blob, resize_shape, processed_input) + + else: + return process_image_fn(grad_blob, resize_shape, self.state.last_frame) + + # else, normal network, run processing once + else: + # run processing on image + return process_image_fn(grad_blob, resize_shape, self.state.last_frame) + + raise Exception("flow should not arrive here") + + if back_view_option == BackpropViewOption.RAW: + def do_raw(grad_blob, resize_shape, input_image): + if len(grad_blob.shape) == 3 and grad_blob.shape[2] != 3: + return np.zeros(resize_shape) + return norm01c(grad_blob, 0) + grad_img = run_processing_once_or_twice(pane.data.shape, do_raw) + + elif back_view_option == BackpropViewOption.GRAY: + def do_gray(grad_blob, resize_shape, input_image): + return norm01c(grad_blob.mean(axis=2), 0) + grad_img = run_processing_once_or_twice(pane.data.shape, do_gray) + + elif back_view_option == BackpropViewOption.NORM: + def do_norm(grad_blob, resize_shape, input_image): + norm_grad_blob = norm01(np.linalg.norm(grad_blob, axis=2)) + return self._mix_input_overlay_with_colormap(norm_grad_blob, resize_shape, input_image) + grad_img = run_processing_once_or_twice(pane.data.shape, do_norm) + + elif back_view_option == BackpropViewOption.NORM_BLUR: + def do_norm_blur(grad_blob, resize_shape, input_image): + grad_blob = np.linalg.norm(grad_blob, axis=2) + cv2.GaussianBlur(grad_blob, (0, 0), self.settings.caffevis_grad_norm_blur_radius, grad_blob) + norm_grad_blob = norm01(grad_blob) + return self._mix_input_overlay_with_colormap(norm_grad_blob, resize_shape, input_image) + grad_img = run_processing_once_or_twice(pane.data.shape, do_norm_blur) + + elif back_view_option == BackpropViewOption.POS_SUM: + def do_pos_sum(grad_blob, resize_shape, input_image): + grad_blob = 
np.maximum(grad_blob.sum(-1), 0) + norm_grad_blob = norm01(grad_blob) + return self._mix_input_overlay_with_colormap(norm_grad_blob, resize_shape, input_image) + grad_img = run_processing_once_or_twice(pane.data.shape, do_pos_sum) + + elif back_view_option == BackpropViewOption.HISTOGRAM: + def do_histogram(grad_blob, resize_shape, input_image): + return array_histogram(grad_blob, half_pane_shape, BackpropMode.to_string(back_mode)+' histogram', 'values', 'count') + + half_pane_shape = (pane.data.shape[0],pane.data.shape[1]/2,3) + grad_img = run_processing_once_or_twice(pane.data.shape, do_histogram) + + else: + raise Exception('Invalid option for back_view_option: %s' % (back_view_option)) + + # If necessary, re-promote from grayscale to color + if len(grad_img.shape) == 2: + grad_img = np.tile(grad_img[:,:,np.newaxis], 3) + + if self.settings.caffevis_keep_aspect_ratio: + grad_img_resize = ensure_uint255_and_resize_to_fit(grad_img, pane.data.shape) else: - grad_img_disp = grad_img + grad_img_resize = ensure_uint255_and_resize_without_fit(grad_img, pane.data.shape) - grad_img_resize = ensure_uint255_and_resize_to_fit(grad_img_disp, pane.data.shape) pane.data[0:grad_img_resize.shape[0], 0:grad_img_resize.shape[1], :] = grad_img_resize def _draw_jpgvis_pane(self, pane): pane.data[:] = to_255(self.settings.window_background) with self.state.lock: - state_layer, state_selected_unit, cursor_area, show_unit_jpgs = self.state.layer, self.state.selected_unit, self.state.cursor_area, self.state.show_unit_jpgs - - try: - # Some may be missing this setting - self.settings.caffevis_jpgvis_layers - except: - print '\n\nNOTE: you need to upgrade your settings.py and settings_local.py files. 
See README.md.\n\n' - raise - - if self.settings.caffevis_jpgvis_remap and state_layer in self.settings.caffevis_jpgvis_remap: - img_key_layer = self.settings.caffevis_jpgvis_remap[state_layer] + state_layer_name, state_selected_unit, cursor_area, show_unit_jpgs = self.state.get_default_layer_name(), self.state.selected_unit, self.state.cursor_area, self.state.show_unit_jpgs + + if self.settings.caffevis_jpgvis_remap and state_layer_name in self.settings.caffevis_jpgvis_remap: + img_key_layer = self.settings.caffevis_jpgvis_remap[state_layer_name] else: - img_key_layer = state_layer + img_key_layer = state_layer_name - if self.settings.caffevis_jpgvis_layers and img_key_layer in self.settings.caffevis_jpgvis_layers and cursor_area == 'bottom' and show_unit_jpgs: - img_key = (img_key_layer, state_selected_unit, pane.data.shape) + if ((self.settings.caffevis_jpgvis_layers and img_key_layer in self.settings.caffevis_jpgvis_layers) or (self.settings.caffevis_jpgvis_layers is None)) and \ + cursor_area == 'bottom' and show_unit_jpgs: + img_key = (img_key_layer, state_selected_unit, pane.data.shape, self.state.show_maximal_score) img_resize = self.img_cache.get(img_key, None) if img_resize is None: # If img_resize is None, loading has not yet been attempted, so show stale image and request load by JPGVisLoadingThread @@ -611,6 +1546,9 @@ def _draw_jpgvis_pane(self, pane): def handle_key(self, key, panes): return self.state.handle_key(key) + def handle_mouse_left_click(self, x, y, flags, param, panes): + self.state.handle_mouse_left_click(x, y, flags, param, panes, self.header_boxes, self.buttons_boxes) + def get_back_what_to_disp(self): '''Whether to show back diff information or stale or disabled indicator''' if (self.state.cursor_area == 'top' and not self.state.backprop_selection_frozen) or not self.state.back_enabled: @@ -634,8 +1572,7 @@ def draw_help(self, help_pane, locy): locx = loc_base[0] lines = [] - lines.append([FormattedString('', defaults)]) - 
lines.append([FormattedString('Caffevis keys', defaults)]) + lines.append([FormattedString('DeepVis keys', defaults)]) kl,_ = self.bindings.get_key_help('sel_left') kr,_ = self.bindings.get_key_help('sel_right') @@ -655,16 +1592,23 @@ def draw_help(self, help_pane, locy): nav_string = 'Navigate with %s%s. Use %s to move faster.' % (keys_nav_0, keys_nav_1, keys_nav_f) lines.append([FormattedString('', defaults, width=120, align='right'), FormattedString(nav_string, defaults)]) - - for tag in ('sel_layer_left', 'sel_layer_right', 'zoom_mode', 'pattern_mode', - 'ez_back_mode_loop', 'freeze_back_unit', 'show_back', 'back_mode', 'back_filt_mode', - 'boost_gamma', 'boost_individual', 'reset_state'): - key_strings, help_string = self.bindings.get_key_help(tag) - label = '%10s:' % (','.join(key_strings)) - lines.append([FormattedString(label, defaults, width=120, align='right'), - FormattedString(help_string, defaults)]) - - locy = cv2_typeset_text(help_pane.data, lines, (locx, locy), - line_spacing = self.settings.help_line_spacing) + + for tag in ('help_mode', 'static_file_increment', 'static_file_decrement', 'sel_layer_left', 'sel_layer_right', + '', 'next_pattern_mode', 'pattern_first_only', '', 'next_input_overlay', 'next_ez_back_mode_loop', + 'next_back_view_option', 'next_color_map', '', 'freeze_back_unit', 'show_back', 'zoom_mode', + 'siamese_view_mode', 'toggle_maximal_score', 'boost_gamma', 'boost_individual', 'freeze_cam', + 'toggle_input_mode', 'stretch_mode', '', 'reset_state', 'quit'): + + if (tag == ''): + lines.append([FormattedString('', defaults)]) + + else: + key_strings, help_string = self.bindings.get_key_help(tag) + label = '%10s:' % (','.join(key_strings)) + lines.append([FormattedString(label, defaults, width=120, align='right'), + FormattedString(help_string, defaults)]) + + locy, boxes = cv2_typeset_text(help_pane.data, lines, (locx, locy), + line_spacing = self.settings.help_line_spacing) return locy diff --git a/caffevis/caffe_proc_thread.py 
b/caffevis/caffe_proc_thread.py index 82b4bb7dc..97e5cea54 100644 --- a/caffevis/caffe_proc_thread.py +++ b/caffevis/caffe_proc_thread.py @@ -5,8 +5,8 @@ from codependent_thread import CodependentThread from misc import WithTimer from caffevis_helper import net_preproc_forward - - +from image_misc import resize_without_fit +from caffevis_app_state import BackpropMode class CaffeProcThread(CodependentThread): '''Runs Caffe in separate thread.''' @@ -25,7 +25,6 @@ def __init__(self, settings, net, state, loop_sleep, pause_after_keys, heartbeat self.pause_after_keys = pause_after_keys self.debug_level = 0 self.mode_gpu = mode_gpu # Needed so the mode can be set again in the spawned thread, because there is a separate Caffe object per thread. - self.settings = settings @@ -68,9 +67,7 @@ def run(self): back_enabled = self.state.back_enabled back_mode = self.state.back_mode back_stale = self.state.back_stale - #state_layer = self.state.layer - #selected_unit = self.state.selected_unit - backprop_layer = self.state.backprop_layer + backprop_layer_def = self.state.get_current_backprop_layer_definition() backprop_unit = self.state.backprop_unit # Forward should be run for every new frame @@ -85,41 +82,28 @@ def run(self): #print 'TIMING:, processing frame' self.frames_processed_fwd += 1 - if self.settings.static_files_input_mode == "siamese_image_list": - frame1 = frame[0] - frame2 = frame[1] - - im_small1 = cv2.resize(frame1, self.input_dims) - im_small2 = cv2.resize(frame2, self.input_dims) - im_small = np.concatenate( (im_small1, im_small2), axis=2) + if self.settings.is_siamese and ((type(frame), len(frame)) == (tuple, 2)): + im_small = self.state.convert_image_pair_to_network_input_format(self.settings, frame, self.input_dims) else: - im_small = cv2.resize(frame, self.input_dims) + im_small = resize_without_fit(frame, self.input_dims) with WithTimer('CaffeProcThread:forward', quiet = self.debug_level < 1): net_preproc_forward(self.settings, self.net, im_small, 
self.input_dims) if run_back: - diffs = self.net.blobs[backprop_layer].diff * 0 - diffs[0][backprop_unit] = self.net.blobs[backprop_layer].data[0,backprop_unit] - assert back_mode in ('grad', 'deconv') - if back_mode == 'grad': + if back_mode == BackpropMode.GRAD: with WithTimer('CaffeProcThread:backward', quiet = self.debug_level < 1): - #print '**** Doing backprop with %s diffs in [%s,%s]' % (backprop_layer, diffs.min(), diffs.max()) - try: - self.net.backward_from_layer(backprop_layer, diffs, zero_higher = True) - except AttributeError: - print 'ERROR: required bindings (backward_from_layer) not found! Try using the deconv-deep-vis-toolbox branch as described here: https://github.com/yosinski/deep-visualization-toolbox' - raise - else: + self.state.backward_from_layer(self.net, backprop_layer_def, backprop_unit) + + elif back_mode == BackpropMode.DECONV_ZF: with WithTimer('CaffeProcThread:deconv', quiet = self.debug_level < 1): - #print '**** Doing deconv with %s diffs in [%s,%s]' % (backprop_layer, diffs.min(), diffs.max()) - try: - self.net.deconv_from_layer(backprop_layer, diffs, zero_higher = True) - except AttributeError: - print 'ERROR: required bindings (deconv_from_layer) not found! 
Try using the deconv-deep-vis-toolbox branch as described here: https://github.com/yosinski/deep-visualization-toolbox' - raise + self.state.deconv_from_layer(self.net, backprop_layer_def, backprop_unit, 'Zeiler & Fergus') + + elif back_mode == BackpropMode.DECONV_GB: + with WithTimer('CaffeProcThread:deconv', quiet=self.debug_level < 1): + self.state.deconv_from_layer(self.net, backprop_layer_def, backprop_unit, 'Guided Backprop') with self.state.lock: self.state.back_stale = False diff --git a/caffevis/caffevis_app_state.py b/caffevis/caffevis_app_state.py index 904096d53..84c41c79b 100644 --- a/caffevis/caffevis_app_state.py +++ b/caffevis/caffevis_app_state.py @@ -1,52 +1,184 @@ +import os import time from threading import Lock +from siamese_helper import SiameseViewMode, SiameseHelper +from caffe_misc import layer_name_to_top_name +from image_misc import get_tiles_height_width_ratio, gray_to_colormap +class PatternMode: + OFF = 0 + MAXIMAL_OPTIMIZED_IMAGE = 1 + MAXIMAL_INPUT_IMAGE = 2 + WEIGHTS_HISTOGRAM = 3 + MAX_ACTIVATIONS_HISTOGRAM = 4 + WEIGHTS_CORRELATION = 5 + ACTIVATIONS_CORRELATION = 6 + NUMBER_OF_MODES = 7 +class BackpropMode: + OFF = 0 + GRAD = 1 + DECONV_ZF = 2 + DECONV_GB = 3 + NUMBER_OF_MODES = 4 + + @staticmethod + def to_string(back_mode): + if back_mode == BackpropMode.OFF: + return 'off' + elif back_mode == BackpropMode.GRAD: + return 'grad' + elif back_mode == BackpropMode.DECONV_ZF: + return 'deconv zf' + elif back_mode == BackpropMode.DECONV_GB: + return 'deconv gb' + + return 'n/a' + +class BackpropViewOption: + RAW = 0 + GRAY = 1 + NORM = 2 + NORM_BLUR = 3 + POS_SUM = 4 + HISTOGRAM = 5 + NUMBER_OF_OPTIONS = 6 + + @staticmethod + def to_string(back_view_option): + if back_view_option == BackpropViewOption.RAW: + return 'raw' + elif back_view_option == BackpropViewOption.GRAY: + return 'gray' + elif back_view_option == BackpropViewOption.NORM: + return 'norm' + elif back_view_option == BackpropViewOption.NORM_BLUR: + return 'normblur' + 
elif back_view_option == BackpropViewOption.POS_SUM: + return 'sum>0' + elif back_view_option == BackpropViewOption.HISTOGRAM: + return 'histogram' + + return 'n/a' + +class ColorMapOption: + GRAY = 0 + JET = 1 + PLASMA = 2 + NUMBER_OF_OPTIONS = 3 + + @staticmethod + def to_string(color_map_option): + if color_map_option == ColorMapOption.GRAY: + return 'gray' + elif color_map_option == ColorMapOption.JET: + return 'jet' + elif color_map_option == ColorMapOption.PLASMA: + return 'plasma' + + return 'n/a' + +class InputOverlayOption: + OFF = 0 + OVER_ACTIVE = 1 + OVER_INACTIVE = 2 + NUMBER_OF_OPTIONS = 3 + + @staticmethod + def to_string(input_overlay_option): + if input_overlay_option == InputOverlayOption.OFF: + return 'off' + elif input_overlay_option == InputOverlayOption.OVER_ACTIVE: + return 'over active' + elif input_overlay_option == InputOverlayOption.OVER_INACTIVE: + return 'over inactive' + + return 'n/a' class CaffeVisAppState(object): '''State of CaffeVis app.''' - def __init__(self, net, settings, bindings, net_layer_info): + def __init__(self, net, settings, bindings, live_vis): self.lock = Lock() # State is accessed in multiple threads self.settings = settings self.bindings = bindings - self._layers = net.blobs.keys() - self._layers = self._layers[1:] # chop off data layer - if hasattr(self.settings, 'caffevis_filter_layers'): - for name in self._layers: - if self.settings.caffevis_filter_layers(name): - print ' Layer filtered out by caffevis_filter_layers: %s' % name - self._layers = filter(lambda name: not self.settings.caffevis_filter_layers(name), self._layers) - self.net_layer_info = net_layer_info + self.net = net + self.live_vis = live_vis + + self.fill_layers_list(net) + + self.siamese_helper = SiameseHelper(settings.layers_list) + + self._populate_net_blob_info(net) + self.layer_boost_indiv_choices = self.settings.caffevis_boost_indiv_choices # 0-1, 0 is noop self.layer_boost_gamma_choices = self.settings.caffevis_boost_gamma_choices # 
0-inf, 1 is noop self.caffe_net_state = 'free' # 'free', 'proc', or 'draw' self.extra_msg = '' self.back_stale = True # back becomes stale whenever the last back diffs were not computed using the current backprop unit and method (bprop or deconv) self.next_frame = None + self.next_label = None + self.next_filename = None + self.last_frame = None self.jpgvis_to_load_key = None self.last_key_at = 0 self.quit = False self._reset_user_state() + def _populate_net_blob_info(self, net): + '''For each blob, save the number of filters and precompute + tile arrangement (needed by CaffeVisAppState to handle keyboard navigation). + ''' + self.net_blob_info = {} + for key in net.blobs.keys(): + self.net_blob_info[key] = {} + # Conv example: (1, 96, 55, 55) + # FC example: (1, 1000) + blob_shape = net.blobs[key].data.shape + + # handle case when output is a single number per image in the batch + if (len(blob_shape) == 1): + blob_shape = (blob_shape[0], 1) + + self.net_blob_info[key]['isconv'] = (len(blob_shape) == 4) + self.net_blob_info[key]['data_shape'] = blob_shape[1:] # Chop off batch size + self.net_blob_info[key]['n_tiles'] = blob_shape[1] + self.net_blob_info[key]['tiles_rc'] = get_tiles_height_width_ratio(blob_shape[1], self.settings.caffevis_layers_aspect_ratio) + self.net_blob_info[key]['tile_rows'] = self.net_blob_info[key]['tiles_rc'][0] + self.net_blob_info[key]['tile_cols'] = self.net_blob_info[key]['tiles_rc'][1] + + def get_headers(self): + + headers = list() + for layer_def in self.settings.layers_list: + headers.append(SiameseHelper.get_header_from_layer_def(layer_def)) + + return headers + def _reset_user_state(self): + self.show_maximal_score = True + self.input_overlay_option = InputOverlayOption.OFF self.layer_idx = 0 - self.layer = self._layers[0] self.layer_boost_indiv_idx = self.settings.caffevis_boost_indiv_default_idx self.layer_boost_indiv = self.layer_boost_indiv_choices[self.layer_boost_indiv_idx] self.layer_boost_gamma_idx = 
self.settings.caffevis_boost_gamma_default_idx self.layer_boost_gamma = self.layer_boost_gamma_choices[self.layer_boost_gamma_idx] self.cursor_area = 'top' # 'top' or 'bottom' self.selected_unit = 0 + self.siamese_view_mode = SiameseViewMode.BOTH_IMAGES + # Which layer and unit (or channel) to use for backprop - self.backprop_layer = self.layer + self.backprop_layer_idx = self.layer_idx self.backprop_unit = self.selected_unit self.backprop_selection_frozen = False # If false, backprop unit tracks selected unit + self.backprop_siamese_view_mode = SiameseViewMode.BOTH_IMAGES self.back_enabled = False - self.back_mode = 'grad' # 'grad' or 'deconv' - self.back_filt_mode = 'raw' # 'raw', 'gray', 'norm', 'normblur' - self.pattern_mode = False # Whether or not to show desired patterns instead of activations in layers pane + self.back_mode = BackpropMode.OFF + self.back_view_option = BackpropViewOption.RAW + self.color_map_option = ColorMapOption.JET + self.pattern_mode = PatternMode.OFF # type of patterns to show instead of activations in layers pane: maximal optimized image, maximal input image, maximal histogram, off + self.pattern_first_only = True # should we load only the first pattern image for each neuron, or all the relevant images per neuron self.layers_pane_zoom_mode = 0 # 0: off, 1: zoom selected (and show pref in small pane), 2: zoom backprop self.layers_show_back = False # False: show forward activations. 
True: show backward diffs self.show_label_predictions = self.settings.caffevis_init_show_label_predictions @@ -54,7 +186,7 @@ def _reset_user_state(self): self.drawing_stale = True kh,_ = self.bindings.get_key_help('help_mode') self.extra_msg = '%s for help' % kh[0] - + def handle_key(self, key): #print 'Ignoring key:', key if key == -1: @@ -71,13 +203,12 @@ def handle_key(self, key): #self.selected_unit = self.selected_unit % ww # equivalent to scrolling all the way to the top row #self.cursor_area = 'top' # Then to the control pane self.layer_idx = max(0, self.layer_idx - 1) - self.layer = self._layers[self.layer_idx] + elif tag == 'sel_layer_right': #hh,ww = self.tiles_height_width #self.selected_unit = self.selected_unit % ww # equivalent to scrolling all the way to the top row #self.cursor_area = 'top' # Then to the control pane - self.layer_idx = min(len(self._layers) - 1, self.layer_idx + 1) - self.layer = self._layers[self.layer_idx] + self.layer_idx = min(len(self.settings.layers_list) - 1, self.layer_idx + 1) elif tag == 'sel_left': self.move_selection('left') @@ -103,64 +234,39 @@ def handle_key(self, key): elif tag == 'boost_gamma': self.layer_boost_gamma_idx = (self.layer_boost_gamma_idx + 1) % len(self.layer_boost_gamma_choices) self.layer_boost_gamma = self.layer_boost_gamma_choices[self.layer_boost_gamma_idx] - elif tag == 'pattern_mode': - self.pattern_mode = not self.pattern_mode - if self.pattern_mode and not hasattr(self.settings, 'caffevis_unit_jpg_dir'): - print 'Cannot switch to pattern mode; caffevis_unit_jpg_dir not defined in settings.py.' 
- self.pattern_mode = False + elif tag == 'next_pattern_mode': + self.set_pattern_mode((self.pattern_mode + 1) % PatternMode.NUMBER_OF_MODES) + + elif tag == 'prev_pattern_mode': + self.set_pattern_mode((self.pattern_mode - 1 + PatternMode.NUMBER_OF_MODES) % PatternMode.NUMBER_OF_MODES) + + elif tag == 'pattern_first_only': + self.pattern_first_only = not self.pattern_first_only + elif tag == 'show_back': - # If in pattern mode: switch to fwd/back. Else toggle fwd/back mode - if self.pattern_mode: - self.pattern_mode = False - else: - self.layers_show_back = not self.layers_show_back - if self.layers_show_back: - if not self.back_enabled: - self.back_enabled = True - self.back_stale = True - elif tag == 'back_mode': - if not self.back_enabled: - self.back_enabled = True - self.back_mode = 'grad' - self.back_stale = True - else: - if self.back_mode == 'grad': - self.back_mode = 'deconv' - self.back_stale = True - else: - self.back_enabled = False - elif tag == 'back_filt_mode': - if self.back_filt_mode == 'raw': - self.back_filt_mode = 'gray' - elif self.back_filt_mode == 'gray': - self.back_filt_mode = 'norm' - elif self.back_filt_mode == 'norm': - self.back_filt_mode = 'normblur' - else: - self.back_filt_mode = 'raw' - elif tag == 'ez_back_mode_loop': - # Cycle: - # off -> grad (raw) -> grad(gray) -> grad(norm) -> grad(normblur) -> deconv - if not self.back_enabled: - self.back_enabled = True - self.back_mode = 'grad' - self.back_filt_mode = 'raw' - self.back_stale = True - elif self.back_mode == 'grad' and self.back_filt_mode == 'raw': - self.back_filt_mode = 'norm' - elif self.back_mode == 'grad' and self.back_filt_mode == 'norm': - self.back_mode = 'deconv' - self.back_filt_mode = 'raw' - self.back_stale = True - else: - self.back_enabled = False + self.set_show_back(not self.layers_show_back) + + elif tag == 'next_ez_back_mode_loop': + self.set_back_mode((self.back_mode + 1) % BackpropMode.NUMBER_OF_MODES) + + elif tag == 'prev_ez_back_mode_loop': + 
self.set_back_mode((self.back_mode - 1 + BackpropMode.NUMBER_OF_MODES) % BackpropMode.NUMBER_OF_MODES) + + elif tag == 'next_back_view_option': + self.set_back_view_option((self.back_view_option + 1) % BackpropViewOption.NUMBER_OF_OPTIONS) + + elif tag == 'prev_back_view_option': + self.set_back_view_option((self.back_view_option - 1 + BackpropViewOption.NUMBER_OF_OPTIONS) % BackpropViewOption.NUMBER_OF_OPTIONS) + + elif tag == 'next_color_map': + self.color_map_option = (self.color_map_option + 1) % ColorMapOption.NUMBER_OF_OPTIONS + + elif tag == 'prev_color_map': + self.color_map_option = (self.color_map_option - 1 + ColorMapOption.NUMBER_OF_OPTIONS) % ColorMapOption.NUMBER_OF_OPTIONS + elif tag == 'freeze_back_unit': - # Freeze selected layer/unit as backprop unit - self.backprop_selection_frozen = not self.backprop_selection_frozen - if self.backprop_selection_frozen: - # Grap layer/selected_unit upon transition from non-frozen -> frozen - self.backprop_layer = self.layer - self.backprop_unit = self.selected_unit + self.toggle_freeze_back_unit() + elif tag == 'zoom_mode': self.layers_pane_zoom_mode = (self.layers_pane_zoom_mode + 1) % 3 if self.layers_pane_zoom_mode == 2 and not self.back_enabled: @@ -173,6 +279,18 @@ def handle_key(self, key): elif tag == 'toggle_unit_jpgs': self.show_unit_jpgs = not self.show_unit_jpgs + elif tag == 'siamese_view_mode': + self.siamese_view_mode = (self.siamese_view_mode + 1) % SiameseViewMode.NUMBER_OF_MODES + + elif tag == 'toggle_maximal_score': + self.show_maximal_score = not self.show_maximal_score + + elif tag == 'next_input_overlay': + self.set_input_overlay((self.input_overlay_option + 1) % InputOverlayOption.NUMBER_OF_OPTIONS) + + elif tag == 'prev_input_overlay': + self.set_input_overlay((self.input_overlay_option - 1 + InputOverlayOption.NUMBER_OF_OPTIONS) % InputOverlayOption.NUMBER_OF_OPTIONS) + else: key_handled = False @@ -182,39 +300,298 @@ def handle_key(self, key): return (None if key_handled else key) + def 
handle_mouse_left_click(self, x, y, flags, param, panes, header_boxes, buttons_boxes): + + for pane_name, pane in panes.items(): + if pane.j_begin <= x < pane.j_end and pane.i_begin <= y < pane.i_end: + + if pane_name == 'caffevis_control': # layers list + + # search for layer clicked on + for box_idx, box in enumerate(header_boxes): + start_x, end_x, start_y, end_y, text = box + if start_x <= x - pane.j_begin < end_x and start_y <= y - pane.i_begin <= end_y: + # print 'layers list clicked on layer %d (%s,%s)' % (box_idx, x, y) + self.layer_idx = box_idx + self.cursor_area = 'top' + self._ensure_valid_selected() + self.drawing_stale = True # Request redraw any time we handled the mouse + return + # print 'layers list clicked on (%s,%s)' % (x, y) + + elif pane_name == 'caffevis_layers': # channels list + # print 'channels list clicked on (%s,%s)' % (x, y) + + default_layer_name = self.get_default_layer_name() + default_top_name = layer_name_to_top_name(self.net, default_layer_name) + + tile_rows, tile_cols = self.net_blob_info[default_top_name]['tiles_rc'] + + dy_per_channel = (pane.data.shape[0] + 1) / float(tile_rows) + dx_per_channel = (pane.data.shape[1] + 1) / float(tile_cols) + + tile_x = int(((x - pane.j_begin) / dx_per_channel) + 1) + tile_y = int(((y - pane.i_begin) / dy_per_channel) + 1) + + channel_id = (tile_y-1) * tile_cols + (tile_x - 1) + + self.selected_unit = channel_id + self.cursor_area = 'bottom' + + self.validate_state_for_summary_only_patterns() + self._ensure_valid_selected() + self.drawing_stale = True # Request redraw any time we handled the mouse + return + + elif pane_name == 'caffevis_buttons': + # print 'buttons!' 
+ + # search for layer clicked on + for box_idx, box in enumerate(buttons_boxes): + start_x, end_x, start_y, end_y, text = box + if start_x <= x - pane.j_begin < end_x and start_y <= y - pane.i_begin <= end_y: + # print 'DEBUG: pressed %s' % text + + if text == 'File': + self.live_vis.input_updater.set_mode_static() + pass + elif text == 'Prev': + self.live_vis.input_updater.prev_image() + pass + elif text == 'Next': + self.live_vis.input_updater.next_image() + pass + elif text == 'Camera': + self.live_vis.input_updater.set_mode_cam() + pass + + elif text == 'Modes': + pass + elif text == 'Activations': + self.set_show_back(False) + pass + elif text == 'Gradients': + self.set_show_back(True) + pass + elif text == 'Maximal Optimized': + with self.lock: + self.set_pattern_mode(PatternMode.MAXIMAL_OPTIMIZED_IMAGE) + pass + elif text == 'Maximal Input': + with self.lock: + self.set_pattern_mode(PatternMode.MAXIMAL_INPUT_IMAGE) + pass + elif text == 'Weights Histogram': + with self.lock: + self.set_pattern_mode(PatternMode.WEIGHTS_HISTOGRAM) + pass + elif text == 'Activations Histogram': + with self.lock: + self.set_pattern_mode(PatternMode.MAX_ACTIVATIONS_HISTOGRAM) + pass + elif text == 'Weights Correlation': + with self.lock: + self.set_pattern_mode(PatternMode.WEIGHTS_CORRELATION) + pass + elif text == 'Activations Correlation': + with self.lock: + self.set_pattern_mode(PatternMode.ACTIVATIONS_CORRELATION) + pass + + elif text == 'Input Overlay': + pass + elif text == 'No Overlay': + self.set_input_overlay(InputOverlayOption.OFF) + pass + elif text == 'Over Active': + self.set_input_overlay(InputOverlayOption.OVER_ACTIVE) + pass + elif text == 'Over Inactive': + self.set_input_overlay(InputOverlayOption.OVER_INACTIVE) + pass + + elif text == 'Backprop Modes': + pass + elif text == 'No Backprop': + self.set_back_mode(BackpropMode.OFF) + pass + elif text == 'Gradient': + self.set_back_mode(BackpropMode.GRAD) + pass + elif text == 'ZF Deconv': + 
self.set_back_mode(BackpropMode.DECONV_ZF) + pass + elif text == 'Guided Backprop': + self.set_back_mode(BackpropMode.DECONV_GB) + pass + elif text == 'Freeze Origin': + self.toggle_freeze_back_unit() + pass + + elif text == 'Backprop Views': + pass + elif text == 'Raw': + self.set_back_view_option(BackpropViewOption.RAW) + pass + elif text == 'Gray': + self.set_back_view_option(BackpropViewOption.GRAY) + pass + elif text == 'Norm': + self.set_back_view_option(BackpropViewOption.NORM) + pass + elif text == 'Blurred Norm': + self.set_back_view_option(BackpropViewOption.NORM_BLUR) + pass + elif text == 'Sum > 0': + self.set_back_view_option(BackpropViewOption.POS_SUM) + pass + elif text == 'Gradient Histogram': + self.set_back_view_option(BackpropViewOption.HISTOGRAM) + pass + + elif text == 'Help': + self.live_vis.toggle_help_mode() + pass + + elif text == 'Quit': + self.live_vis.set_quit_flag() + pass + + self._ensure_valid_selected() + self.drawing_stale = True + return + + else: + # print "Clicked: %s - %s" % (x, y) + pass + break + + pass + def redraw_needed(self): with self.lock: return self.drawing_stale + def get_current_layer_definition(self): + return self.settings.layers_list[self.layer_idx] + + def get_current_backprop_layer_definition(self): + return self.settings.layers_list[self.backprop_layer_idx] + + def get_single_selected_data_blob(self, net, layer_def = None): + + # if no layer specified, get current layer + if layer_def is None: + layer_def = self.get_current_layer_definition() + + return self.siamese_helper.get_single_selected_data_blob(net, layer_def, self.siamese_view_mode) + + def get_single_selected_diff_blob(self, net, layer_def = None): + + # if no layer specified, get current layer + if layer_def is None: + layer_def = self.get_current_layer_definition() + + return self.siamese_helper.get_single_selected_diff_blob(net, layer_def, self.siamese_view_mode) + + def get_siamese_selected_data_blobs(self, net, layer_def = None): + + # if no 
layer specified, get current layer + if layer_def is None: + layer_def = self.get_current_layer_definition() + + return self.siamese_helper.get_siamese_selected_data_blobs(net, layer_def, self.siamese_view_mode) + + def get_siamese_selected_diff_blobs(self, net, layer_def = None): + + # if no layer specified, get current layer + if layer_def is None: + layer_def = self.get_current_layer_definition() + + return self.siamese_helper.get_siamese_selected_diff_blobs(net, layer_def, self.siamese_view_mode) + + + def backward_from_layer(self, net, backprop_layer_def, backprop_unit): + + try: + return SiameseHelper.backward_from_layer(net, backprop_layer_def, backprop_unit, self.siamese_view_mode) + except AttributeError: + print 'ERROR: required bindings (backward_from_layer) not found! Try using the deconv-deep-vis-toolbox branch as described here: https://github.com/yosinski/deep-visualization-toolbox' + raise + except ValueError: + print "ERROR: probably impossible to backprop layer %s, ignoring to avoid crash" % (str(backprop_layer_def['name/s'])) + with self.lock: + self.back_enabled = False + + def deconv_from_layer(self, net, backprop_layer_def, backprop_unit, deconv_type): + + try: + return SiameseHelper.deconv_from_layer(net, backprop_layer_def, backprop_unit, self.siamese_view_mode, deconv_type) + except AttributeError: + print 'ERROR: required bindings (deconv_from_layer) not found! 
Try using the deconv-deep-vis-toolbox branch as described here: https://github.com/yosinski/deep-visualization-toolbox' + raise + except ValueError: + print "ERROR: probably impossible to deconv layer %s, ignoring to avoid crash" % (str(backprop_layer_def['name/s'])) + with self.lock: + self.back_enabled = False + + def get_default_layer_name(self, layer_def = None): + + # if no layer specified, get current layer + if layer_def is None: + layer_def = self.get_current_layer_definition() + + return self.siamese_helper.get_default_layer_name(layer_def) + + def siamese_view_mode_has_two_images(self, layer_def = None): + ''' + helper function which checks whether the input mode is two images, and the provided layer contains two layer names + :param layer_def: can be a single string layer name, or a pair of layer names + :return: True if both the input mode is BOTH_IMAGES and the layer contains two layer names, False otherwise + ''' + # if no layer specified, get current layer + if layer_def is None: + layer_def = self.get_current_layer_definition() + + return SiameseHelper.siamese_view_mode_has_two_images(layer_def, self.siamese_view_mode) + def move_selection(self, direction, dist = 1): + + default_layer_name = self.get_default_layer_name() + default_top_name = layer_name_to_top_name(self.net, default_layer_name) + if direction == 'left': if self.cursor_area == 'top': self.layer_idx = max(0, self.layer_idx - dist) - self.layer = self._layers[self.layer_idx] else: self.selected_unit -= dist elif direction == 'right': if self.cursor_area == 'top': - self.layer_idx = min(len(self._layers) - 1, self.layer_idx + dist) - self.layer = self._layers[self.layer_idx] + self.layer_idx = min(len(self.settings.layers_list) - 1, self.layer_idx + dist) else: self.selected_unit += dist elif direction == 'down': if self.cursor_area == 'top': self.cursor_area = 'bottom' else: - self.selected_unit += self.net_layer_info[self.layer]['tile_cols'] * dist + self.selected_unit +=
self.net_blob_info[default_top_name]['tile_cols'] * dist elif direction == 'up': if self.cursor_area == 'top': pass else: - self.selected_unit -= self.net_layer_info[self.layer]['tile_cols'] * dist + self.selected_unit -= self.net_blob_info[default_top_name]['tile_cols'] * dist if self.selected_unit < 0: - self.selected_unit += self.net_layer_info[self.layer]['tile_cols'] + self.selected_unit += self.net_blob_info[default_top_name]['tile_cols'] self.cursor_area = 'top' + self.validate_state_for_summary_only_patterns() + def _ensure_valid_selected(self): - n_tiles = self.net_layer_info[self.layer]['n_tiles'] + + default_layer_name = self.get_default_layer_name() + default_top_name = layer_name_to_top_name(self.net, default_layer_name) + + n_tiles = self.net_blob_info[default_top_name]['n_tiles'] # Forward selection self.selected_unit = max(0, self.selected_unit) @@ -223,7 +600,109 @@ def _ensure_valid_selected(self): # Backward selection if not self.backprop_selection_frozen: # If backprop_selection is not frozen, backprop layer/unit follows selected unit - if not (self.backprop_layer == self.layer and self.backprop_unit == self.selected_unit): - self.backprop_layer = self.layer + if not (self.backprop_layer_idx == self.layer_idx and self.backprop_unit == self.selected_unit and self.backprop_siamese_view_mode == self.siamese_view_mode): + self.backprop_layer_idx = self.layer_idx self.backprop_unit = self.selected_unit + self.backprop_siamese_view_mode = self.siamese_view_mode self.back_stale = True # If there is any change, back diffs are now stale + + def fill_layers_list(self, net): + + # if layers list is empty, fill it with layer names + if not self.settings.layers_list: + + # go over layers + self.settings.layers_list = [] + for layer_name in list(net._layer_names): + + # skip inplace layers + if len(net.top_names[layer_name]) == 1 and len(net.bottom_names[layer_name]) == 1 and net.top_names[layer_name][0] == net.bottom_names[layer_name][0]: + continue + + 
self.settings.layers_list.append( {'format': 'normal', 'name/s': layer_name} ) + + # filter layers if needed + if hasattr(self.settings, 'caffevis_filter_layers'): + for layer_def in self.settings.layers_list: + if self.settings.caffevis_filter_layers(layer_def['name/s']): + print ' Layer filtered out by caffevis_filter_layers: %s' % str(layer_def['name/s']) + self.settings.layers_list = filter(lambda layer_def: not self.settings.caffevis_filter_layers(layer_def['name/s']), self.settings.layers_list) + + + pass + + def gray_to_colormap(self, gray_image): + + if self.color_map_option == ColorMapOption.GRAY: + return gray_image + + elif self.color_map_option == ColorMapOption.JET: + return gray_to_colormap('jet', gray_image) + + elif self.color_map_option == ColorMapOption.PLASMA: + return gray_to_colormap('plasma', gray_image) + + def convert_image_pair_to_network_input_format(self, settings, frame_pair, resize_shape): + return SiameseHelper.convert_image_pair_to_network_input_format(frame_pair, resize_shape, settings.siamese_input_mode) + + def get_layer_output_size(self, layer_def = None): + + # if no layer specified, get current layer + if layer_def is None: + layer_def = self.get_current_layer_definition() + + return SiameseHelper.get_layer_output_size(self.net, self.settings.is_siamese, layer_def, self.siamese_view_mode) + + def get_layer_output_size_string(self, layer_def=None): + + layer_output_size = self.get_layer_output_size(layer_def) + + if len(layer_output_size) == 1: + return '(%d)' % (layer_output_size[0]) + elif len(layer_output_size) == 2: + return '(%d,%d)' % (layer_output_size[0],layer_output_size[1]) + elif len(layer_output_size) == 3: + return '(%d,%d,%d)' % (layer_output_size[0], layer_output_size[1], layer_output_size[2]) + else: + return str(layer_output_size) + + def validate_state_for_summary_only_patterns(self): + if self.pattern_mode in [PatternMode.ACTIVATIONS_CORRELATION, PatternMode.WEIGHTS_CORRELATION]: + self.cursor_area = 'top' + + 
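The in-place-layer check in fill_layers_list above skips layers whose single top blob equals their single bottom blob (e.g. a ReLU run in place), since such layers have no distinct output blob to visualize. A minimal standalone sketch of that filter, assuming only the top_names/bottom_names mappings that pycaffe exposes (the example layer names are illustrative):

```python
# Standalone sketch of the in-place-layer filter used by fill_layers_list.
# top_names / bottom_names mirror pycaffe's {layer_name: [blob_name, ...]} dicts.

def build_layers_list(layer_names, top_names, bottom_names):
    '''Return layer definitions, skipping in-place layers, i.e. layers whose
    single top blob is the same blob as their single bottom blob.'''
    layers_list = []
    for layer_name in layer_names:
        tops, bottoms = top_names[layer_name], bottom_names[layer_name]
        if len(tops) == 1 and len(bottoms) == 1 and tops[0] == bottoms[0]:
            continue  # in-place layer: it writes back into its input blob
        layers_list.append({'format': 'normal', 'name/s': layer_name})
    return layers_list

# Example: relu1 operates in place on conv1's output blob, so it is skipped.
tops = {'conv1': ['conv1'], 'relu1': ['conv1'], 'pool1': ['pool1']}
bottoms = {'conv1': ['data'], 'relu1': ['conv1'], 'pool1': ['conv1']}
print([d['name/s'] for d in build_layers_list(['conv1', 'relu1', 'pool1'], tops, bottoms)])  # -> ['conv1', 'pool1']
```

The same skip rule is why in-place activation layers never show up as separate entries in the layers list.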
def set_pattern_mode(self, pattern_mode): + self.pattern_mode = pattern_mode + self.validate_state_for_summary_only_patterns() + + def set_show_back(self, show_back): + # If in pattern mode: switch to fwd/back. Else toggle fwd/back mode + if self.pattern_mode != PatternMode.OFF: + self.set_pattern_mode(PatternMode.OFF) + + self.layers_show_back = show_back + if self.layers_show_back: + if not self.back_enabled: + if self.back_mode == BackpropMode.OFF: + self.back_mode = BackpropMode.GRAD + self.back_enabled = True + self.back_stale = True + + def set_input_overlay(self, input_overlay): + self.input_overlay_option = input_overlay + + def set_back_mode(self, back_mode): + self.back_mode = back_mode + self.back_enabled = (self.back_mode != BackpropMode.OFF) + self.back_stale = True + + def toggle_freeze_back_unit(self): + # Freeze selected layer/unit as backprop unit + self.backprop_selection_frozen = not self.backprop_selection_frozen + if self.backprop_selection_frozen: + # Grab layer/selected_unit upon transition from non-frozen -> frozen + self.backprop_layer_idx = self.layer_idx + self.backprop_unit = self.selected_unit + self.backprop_siamese_view_mode = self.siamese_view_mode + + def set_back_view_option(self, back_view_option): + self.back_view_option = back_view_option diff --git a/caffevis/caffevis_helper.py b/caffevis/caffevis_helper.py index b2880f5bf..ee91e82a9 100644 --- a/caffevis/caffevis_helper.py +++ b/caffevis/caffevis_helper.py @@ -1,14 +1,21 @@ -import numpy as np +#!
/usr/bin/env python -from image_misc import get_tiles_height_width, caffe_load_image +from sys import float_info +import numpy as np +import os +import glob +from image_misc import get_tiles_height_width, caffe_load_image, ensure_uint255_and_resize_without_fit, FormattedString, \ + cv2_typeset_text, to_255 def net_preproc_forward(settings, net, img, data_hw): - if settings.static_files_input_mode == "siamese_image_list": + if settings.is_siamese and img.shape[2] == 6: appropriate_shape = data_hw + (6,) - else: + elif settings._calculated_is_gray_model: + appropriate_shape = data_hw + (1,) + else: # default is color appropriate_shape = data_hw + (3,) assert img.shape == appropriate_shape, 'img is wrong size (got %s but expected %s)' % (img.shape, appropriate_shape) @@ -17,6 +24,7 @@ def net_preproc_forward(settings, net, img, data_hw): data_blob = net.transformer.preprocess('data', img) # e.g. (3, 227, 227), mean subtracted and scaled to [0,255] data_blob = data_blob[np.newaxis,:,:,:] # e.g. (1, 3, 227, 227) output = net.forward(data=data_blob) + return output @@ -24,9 +32,9 @@ def get_pretty_layer_name(settings, layer_name): has_old_settings = hasattr(settings, 'caffevis_layer_pretty_names') has_new_settings = hasattr(settings, 'caffevis_layer_pretty_name_fn') if has_old_settings and not has_new_settings: - print ('WARNING: Your settings.py and/or settings_local.py are out of date.' + print ('WARNING: Your settings.py and/or settings_model_selector.py are out of date.' 'caffevis_layer_pretty_names has been replaced with caffevis_layer_pretty_name_fn.' 
- 'Update your settings.py and/or settings_local.py (see documentation in' + 'Update your settings.py and/or settings_model_selector.py (see documentation in' 'setttings.py) to remove this warning.') return settings.caffevis_layer_pretty_names.get(layer_name, layer_name) @@ -121,3 +129,140 @@ def check_force_backward_true(prototxt_file): print 'to your prototxt file before continuing to force backprop to compute derivatives' print 'at the data layer as well.\n\n' + +def load_mean_file(data_mean_file): + filename, file_extension = os.path.splitext(data_mean_file) + if file_extension == ".npy": + # load mean from numpy array + data_mean = np.load(data_mean_file) + print "Loaded mean from numpy file, data_mean.shape: ", data_mean.shape + + elif file_extension == ".binaryproto": + + # load mean from binary protobuf file + import caffe + blob = caffe.proto.caffe_pb2.BlobProto() + data = open(data_mean_file, 'rb').read() + blob.ParseFromString(data) + data_mean = np.array(caffe.io.blobproto_to_array(blob)) + data_mean = np.squeeze(data_mean) + print "Loaded mean from binaryproto file, data_mean.shape: ", data_mean.shape + + else: + # unknown file extension, trying to load as numpy array + data_mean = np.load(data_mean_file) + print "Loaded mean from numpy file, data_mean.shape: ", data_mean.shape + + return data_mean + +def set_mean(caffevis_data_mean, generate_channelwise_mean, net): + + if isinstance(caffevis_data_mean, basestring): + # If the mean is given as a filename, load the file + try: + data_mean = load_mean_file(caffevis_data_mean) + except IOError: + print '\n\nCould not load mean file:', caffevis_data_mean + print 'Ensure that the values in settings.py point to a valid model weights file, network' + print 'definition prototxt, and mean. To fetch a default model and mean file, use:\n' + print '$ cd models/caffenet-yos/' + print '$ ./fetch.sh\n\n' + raise + input_shape = net.blobs[net.inputs[0]].data.shape[-2:] # e.g. 227x227 + # Crop center region (e.g.
227x227) if mean is larger (e.g. 256x256) + excess_h = data_mean.shape[1] - input_shape[0] + excess_w = data_mean.shape[2] - input_shape[1] + assert excess_h >= 0 and excess_w >= 0, 'mean should be at least as large as %s' % repr(input_shape) + data_mean = data_mean[:, (excess_h / 2):(excess_h / 2 + input_shape[0]), + (excess_w / 2):(excess_w / 2 + input_shape[1])] + elif caffevis_data_mean is None: + data_mean = None + else: + # The mean has been given as a value or a tuple of values + data_mean = np.array(caffevis_data_mean) + # Promote to shape C,1,1 + # while len(data_mean.shape) < 3: + # data_mean = np.expand_dims(data_mean, -1) + + if generate_channelwise_mean: + data_mean = data_mean.mean(1).mean(1) + + if data_mean is not None: + print 'Using mean with shape:', data_mean.shape + net.transformer.set_mean(net.inputs[0], data_mean) + + return data_mean + + +def get_image_from_files(settings, unit_folder_path, should_crop_to_corner, resize_shape, first_only, captions = [], values = []): + try: + + # list unit images + unit_images_path = sorted(glob.glob(unit_folder_path)) + + mega_image = np.zeros((resize_shape[0], resize_shape[1], 3), dtype=np.uint8) + + # if no images + if not unit_images_path: + return mega_image + + if first_only: + unit_images_path = [unit_images_path[0]] + + # load all images + unit_images = [caffe_load_image(unit_image_path, color=True, as_uint=True) for unit_image_path in + unit_images_path] + + if settings.caffevis_clear_negative_activations: + # clear images with 0 value + if values: + for i in range(len(values)): + if values[i] < float_info.epsilon: + unit_images[i] *= 0 + + if should_crop_to_corner: + unit_images = [crop_to_corner(img, 2) for img in unit_images] + + num_images = len(unit_images) + images_per_axis = int(np.math.ceil(np.math.sqrt(num_images))) + padding_pixel = 1 + + if first_only: + single_resized_image_shape = (resize_shape[0] - 2*padding_pixel, resize_shape[1] - 2*padding_pixel) + else: + single_resized_image_shape 
= ((resize_shape[0] / images_per_axis) - 2*padding_pixel, (resize_shape[1] / images_per_axis) - 2*padding_pixel) + unit_images = [ensure_uint255_and_resize_without_fit(unit_image, single_resized_image_shape) for unit_image in unit_images] + + # build mega image + + should_add_caption = (len(captions) == num_images) + defaults = {'face': settings.caffevis_score_face, + 'fsize': settings.caffevis_score_fsize, + 'clr': to_255(settings.caffevis_score_clr), + 'thick': settings.caffevis_score_thick} + + for i in range(num_images): + + # add caption if we have exactly one for each image + if should_add_caption: + loc = settings.caffevis_score_loc[::-1] # Reverse to OpenCV c,r order + fs = FormattedString(captions[i], defaults) + cv2_typeset_text(unit_images[i], [[fs]], loc) + + cell_row = i / images_per_axis + cell_col = i % images_per_axis + mega_image_height_start = 1 + cell_row * (single_resized_image_shape[0] + 2 * padding_pixel) + mega_image_height_end = mega_image_height_start + single_resized_image_shape[0] + mega_image_width_start = 1 + cell_col * (single_resized_image_shape[1] + 2 * padding_pixel) + mega_image_width_end = mega_image_width_start + single_resized_image_shape[1] + mega_image[mega_image_height_start:mega_image_height_end, mega_image_width_start:mega_image_width_end,:] = unit_images[i] + + return mega_image + + except: + print '\nAttempted to load files from %s but failed. 
' % unit_folder_path + # set black image as placeholder + return np.zeros((resize_shape[0], resize_shape[1], 3), dtype=np.uint8) + pass + + return \ No newline at end of file diff --git a/caffevis/jpg_vis_loading_thread.py b/caffevis/jpg_vis_loading_thread.py index 1bc69c52a..c204114c2 100644 --- a/caffevis/jpg_vis_loading_thread.py +++ b/caffevis/jpg_vis_loading_thread.py @@ -1,12 +1,17 @@ import os import time + +import cv2 import numpy as np +import glob +import math from codependent_thread import CodependentThread -from image_misc import caffe_load_image, ensure_uint255_and_resize_to_fit -from caffevis_helper import crop_to_corner - +from image_misc import caffe_load_image, ensure_uint255_and_resize_to_fit, \ + ensure_uint255_and_resize_without_fit +from caffevis_helper import crop_to_corner, get_image_from_files +import caffe class JPGVisLoadingThread(CodependentThread): '''Loads JPGs necessary for caffevis_jpgvis pane in separate @@ -21,7 +26,62 @@ def __init__(self, settings, state, cache, loop_sleep, heartbeat_required): self.cache = cache self.loop_sleep = loop_sleep self.debug_level = 0 - + + + def load_image_into_pane_original_format(self, state_layer_name, state_selected_unit, resize_shape, images, sub_folder, + file_pattern, image_index_to_set, should_crop_to_corner=False): + + jpg_path = os.path.join(self.settings.caffevis_outputs_dir, sub_folder, state_layer_name, + file_pattern % (state_layer_name, state_selected_unit)) + + try: + img = caffe.io.load_image(jpg_path) + + if should_crop_to_corner: + img = crop_to_corner(img, 2) + images[image_index_to_set] = ensure_uint255_and_resize_without_fit(img, resize_shape) + + except IOError: + print '\nAttempted to load file %s but failed.
To suppress this warning, remove layer "%s" from settings.caffevis_jpgvis_layers' % \ + (jpg_path, state_layer_name) + # set black image as placeholder + images[image_index_to_set] = np.zeros((resize_shape[0], resize_shape[1], 3), dtype=np.uint8) + pass + + def get_score_values_for_max_input_images(self, state_layer_name, state_selected_unit): + + try: + + info_file_path = os.path.join(self.settings.caffevis_outputs_dir, state_layer_name, + "unit_%04d" % (state_selected_unit), + "info.txt") + + # open file + with open(info_file_path, 'r') as info_file: + lines = info_file.readlines() + + # skip first line + lines = lines[1:] + + # take second word from each line, and convert to float + values = [float(line.split(' ')[1]) for line in lines] + + return values + + except IOError: + return [] + pass + + def load_image_into_pane_max_tracker_format(self, state_layer_name, state_selected_unit, resize_shape, images, + file_search_pattern, image_index_to_set, should_crop_to_corner=False, first_only = False, captions = [], values = []): + + unit_folder_path = os.path.join(self.settings.caffevis_outputs_dir, state_layer_name, + "unit_%04d" % (state_selected_unit), + file_search_pattern) + + images[image_index_to_set] = get_image_from_files(self.settings, unit_folder_path, should_crop_to_corner, resize_shape, first_only, captions, values) + return + def run(self): print 'JPGVisLoadingThread.run called' @@ -42,7 +102,7 @@ def run(self): time.sleep(self.loop_sleep) continue - state_layer, state_selected_unit, data_shape = jpgvis_to_load_key + state_layer_name, state_selected_unit, data_shape, show_maximal_score = jpgvis_to_load_key # Load three images: images = [None] * 3 @@ -58,41 +118,59 @@ def run(self): resize_shape = (data_shape[0]/3, data_shape[1]) else: resize_shape = (data_shape[0], data_shape[1]/3) - - # 0. e.g.
regularized_opt/conv1/conv1_0037_montage.jpg - jpg_path = os.path.join(self.settings.caffevis_unit_jpg_dir, - 'regularized_opt', - state_layer, - '%s_%04d_montage.jpg' % (state_layer, state_selected_unit)) - try: - img = caffe_load_image(jpg_path, color = True) - img_corner = crop_to_corner(img, 2) - images[0] = ensure_uint255_and_resize_to_fit(img_corner, resize_shape) - except IOError: - print '\nAttempted to load file %s but failed. To supress this warning, remove layer "%s" from settings.caffevis_jpgvis_layers' % (jpg_path, state_layer) - pass - - # 1. e.g. max_im/conv1/conv1_0037.jpg - jpg_path = os.path.join(self.settings.caffevis_unit_jpg_dir, - 'max_im', - state_layer, - '%s_%04d.jpg' % (state_layer, state_selected_unit)) - try: - img = caffe_load_image(jpg_path, color = True) - images[1] = ensure_uint255_and_resize_to_fit(img, resize_shape) - except IOError: - pass - - # 2. e.g. max_deconv/conv1/conv1_0037.jpg - try: - jpg_path = os.path.join(self.settings.caffevis_unit_jpg_dir, - 'max_deconv', - state_layer, - '%s_%04d.jpg' % (state_layer, state_selected_unit)) - img = caffe_load_image(jpg_path, color = True) - images[2] = ensure_uint255_and_resize_to_fit(img, resize_shape) - except IOError: - pass + + + if self.settings.caffevis_outputs_dir_folder_format == 'original_combined_single_image': + + # 0. e.g. regularized_opt/conv1/conv1_0037_montage.jpg + self.load_image_into_pane_original_format(state_layer_name, state_selected_unit, resize_shape, images, + sub_folder='regularized_opt', + file_pattern='%s_%04d_montage.jpg', + image_index_to_set=0, + should_crop_to_corner=True) + + elif self.settings.caffevis_outputs_dir_folder_format == 'max_tracker_output': + self.load_image_into_pane_max_tracker_format(state_layer_name, state_selected_unit, resize_shape, images, + file_search_pattern='opt*.jpg', + image_index_to_set=0) + + if self.settings.caffevis_outputs_dir_folder_format == 'original_combined_single_image': + + # 1. e.g. 
max_im/conv1/conv1_0037.jpg + self.load_image_into_pane_original_format(state_layer_name, state_selected_unit, resize_shape, images, + sub_folder='max_im', + file_pattern='%s_%04d.jpg', + image_index_to_set=1) + + elif self.settings.caffevis_outputs_dir_folder_format == 'max_tracker_output': + + # convert to string with 2 decimal digits + values = self.get_score_values_for_max_input_images(state_layer_name, state_selected_unit) + + if self.state.show_maximal_score: + captions = [('%.2f' % value) for value in values] + else: + captions = [] + self.load_image_into_pane_max_tracker_format(state_layer_name, state_selected_unit, resize_shape, images, + file_search_pattern='maxim*.png', + image_index_to_set=1, captions=captions, values=values) + + + if self.settings.caffevis_outputs_dir_folder_format == 'original_combined_single_image': + + # 2. e.g. max_deconv/conv1/conv1_0037.jpg + self.load_image_into_pane_original_format(state_layer_name, state_selected_unit, resize_shape, images, + sub_folder='max_deconv', + file_pattern='%s_%04d.jpg', + image_index_to_set=2) + + elif self.settings.caffevis_outputs_dir_folder_format == 'max_tracker_output': + # convert to string with 2 decimal digits + values = self.get_score_values_for_max_input_images(state_layer_name, state_selected_unit) + + self.load_image_into_pane_max_tracker_format(state_layer_name, state_selected_unit, resize_shape, images, + file_search_pattern='deconv*.png', + image_index_to_set=2, values=values) # Prune images that were not found: images = [im for im in images if im is not None] diff --git a/clean_build.sh b/clean_build.sh new file mode 100755 index 000000000..f352ecd63 --- /dev/null +++ b/clean_build.sh @@ -0,0 +1,2 @@ +cd caffe +make clean diff --git a/doc/activation-correlation.png b/doc/activation-correlation.png new file mode 100644 index 000000000..982b72949 Binary files /dev/null and b/doc/activation-correlation.png differ diff --git a/doc/basic-layout.png b/doc/basic-layout.png new file mode 
100644 index 000000000..57b403810 Binary files /dev/null and b/doc/basic-layout.png differ diff --git a/doc/computing_per_unit_visualizations.md b/doc/computing_per_unit_visualizations.md index e1bca78e2..84e61fea9 100644 --- a/doc/computing_per_unit_visualizations.md +++ b/doc/computing_per_unit_visualizations.md @@ -8,4 +8,4 @@ But the per-unit visualizations can be computed for any network: * To find images (for FC layers) or crops (for conv layers) from a set of images (e.g. the ImageNet training or validation set) that cause highest activation, use the [find_max_acts.py](/find_maxes/find_max_acts.py) script to go through the set of images and note the top K images/crops and then [crop_max_patches.py](/find_maxes/crop_max_patches.py) to use the noted max images / max locations to output the crops and/or deconv of the crops. -Results of both of the above steps will be saved as per-unit jpg image files, which can be loaded by the toolbox when browsing units. To do so, just point the `caffevis_unit_jpg_dir` setting to the directory containing the per-unit images. +Results of both of the above steps will be saved as per-unit jpg image files, which can be loaded by the toolbox when browsing units. To do so, just point the `caffevis_outputs_dir` setting to the directory containing the per-unit images. 
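The `caffevis_outputs_dir` setting mentioned above is expected to point at the per-unit layout that the find_maxes scripts produce: one folder per unit under each layer, zero-padded to four digits. A minimal sketch of that path construction (directory names here are illustrative):

```python
import os

def unit_output_dir(outputs_dir, layer_name, unit_idx):
    # One folder per unit: outputs_dir/layer/unit_%04d
    # This is where the toolbox searches for maxim*.png, deconv*.png, opt*.jpg, info.txt
    return os.path.join(outputs_dir, layer_name, 'unit_%04d' % unit_idx)

print(unit_output_dir('deepvis_outputs', 'conv1', 37))
```

Pointing `caffevis_outputs_dir` at the root of this tree is all the toolbox needs to find the per-unit images while browsing.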
diff --git a/doc/detect-inactive-channels.png b/doc/detect-inactive-channels.png new file mode 100644 index 000000000..2c2590e53 Binary files /dev/null and b/doc/detect-inactive-channels.png differ diff --git a/doc/detect-inactive-layers.png b/doc/detect-inactive-layers.png new file mode 100644 index 000000000..843838ce8 Binary files /dev/null and b/doc/detect-inactive-layers.png differ diff --git a/doc/guided-backprop.png b/doc/guided-backprop.png new file mode 100644 index 000000000..d26f5f3de Binary files /dev/null and b/doc/guided-backprop.png differ diff --git a/doc/maximal-input.png b/doc/maximal-input.png new file mode 100644 index 000000000..b493a8933 Binary files /dev/null and b/doc/maximal-input.png differ diff --git a/doc/maximal-optimized.png b/doc/maximal-optimized.png new file mode 100644 index 000000000..8bbae5631 Binary files /dev/null and b/doc/maximal-optimized.png differ diff --git a/find_maxes/__init__.py b/find_maxes/__init__.py new file mode 100644 index 000000000..e69de29bb diff --git a/find_maxes/caffe_misc.py b/find_maxes/caffe_misc.py deleted file mode 100644 index d126650bb..000000000 --- a/find_maxes/caffe_misc.py +++ /dev/null @@ -1,149 +0,0 @@ -#! 
/usr/bin/env python - -import skimage.io -import numpy as np - - - -def shownet(net): - '''Print some stats about a net and its activations''' - - print '%-41s%-31s%s' % ('', 'acts', 'act diffs') - print '%-45s%-31s%s' % ('', 'params', 'param diffs') - for k, v in net.blobs.items(): - if k in net.params: - params = net.params[k] - for pp, blob in enumerate(params): - if pp == 0: - print ' ', 'P: %-5s'%k, - else: - print ' ' * 11, - print '%-32s' % repr(blob.data.shape), - print '%-30s' % ('(%g, %g)' % (blob.data.min(), blob.data.max())), - print '(%g, %g)' % (blob.diff.min(), blob.diff.max()) - print '%-5s'%k, '%-34s' % repr(v.data.shape), - print '%-30s' % ('(%g, %g)' % (v.data.min(), v.data.max())), - print '(%g, %g)' % (v.diff.min(), v.diff.max()) - - - -def region_converter(top_slice, bot_size, top_size, filter_width = (1,1), stride = (1,1), pad = (0,0)): - ''' - Works for conv or pool - -vector ConvolutionLayer::JBY_region_of_influence(const vector& slice) { - + CHECK_EQ(slice.size(), 4) << "slice must have length 4 (ii_start, ii_end, jj_start, jj_end)"; - + // Crop region to output size - + vector sl = vector(slice); - + sl[0] = max(0, min(height_out_, slice[0])); - + sl[1] = max(0, min(height_out_, slice[1])); - + sl[2] = max(0, min(width_out_, slice[2])); - + sl[3] = max(0, min(width_out_, slice[3])); - + vector roi; - + roi.resize(4); - + roi[0] = sl[0] * stride_h_ - pad_h_; - + roi[1] = (sl[1]-1) * stride_h_ + kernel_h_ - pad_h_; - + roi[2] = sl[2] * stride_w_ - pad_w_; - + roi[3] = (sl[3]-1) * stride_w_ + kernel_w_ - pad_w_; - + return roi; - +} - ''' - assert len(top_slice) == 4 - assert len(bot_size) == 2 - assert len(top_size) == 2 - assert len(filter_width) == 2 - assert len(stride) == 2 - assert len(pad) == 2 - - # Crop top slice to allowable region - top_slice = [ss for ss in top_slice] # Copy list or array -> list - - top_slice[0] = max(0, min(top_size[0], top_slice[0])) - top_slice[1] = max(0, min(top_size[0], top_slice[1])) - top_slice[2] = 
max(0, min(top_size[1], top_slice[2])) - top_slice[3] = max(0, min(top_size[1], top_slice[3])) - - bot_slice = [-123] * 4 - - bot_slice[0] = top_slice[0] * stride[0] - pad[0]; - bot_slice[1] = (top_slice[1]-1) * stride[0] + filter_width[0] - pad[0]; - bot_slice[2] = top_slice[2] * stride[1] - pad[1]; - bot_slice[3] = (top_slice[3]-1) * stride[1] + filter_width[1] - pad[1]; - - return bot_slice - -def get_conv_converter(bot_size, top_size, filter_width = (1,1), stride = (1,1), pad = (0,0)): - return lambda top_slice : region_converter(top_slice, bot_size, top_size, filter_width, stride, pad) - -def get_pool_converter(bot_size, top_size, filter_width = (1,1), stride = (1,1), pad = (0,0)): - return lambda top_slice : region_converter(top_slice, bot_size, top_size, filter_width, stride, pad) - - - -class RegionComputer(object): - '''Computes regions of possible influcence from higher layers to lower layers. - - Woefully hardcoded''' - - def __init__(self): - #self.net = net - _tmp = [] - _tmp.append(('data', None)) - _tmp.append(('conv1', get_conv_converter((227,227), (55,55), (11,11), (4,4)))) - _tmp.append(('pool1', get_pool_converter((55,55), (27,27), (3,3), (2,2)))) - _tmp.append(('conv2', get_conv_converter((27,27), (27,27), (5,5), (1,1), (2,2)))) - _tmp.append(('pool2', get_pool_converter((27,27), (13,13), (3,3), (2,2)))) - _tmp.append(('conv3', get_conv_converter((13,13), (13,13), (3,3), (1,1), (1,1)))) - _tmp.append(('conv4', get_conv_converter((13,13), (13,13), (3,3), (1,1), (1,1)))) - _tmp.append(('conv5', get_conv_converter((13,13), (13,13), (3,3), (1,1), (1,1)))) - self.names = [tt[0] for tt in _tmp] - self.converters = [tt[1] for tt in _tmp] - - def convert_region(self, from_layer, to_layer, region, verbose = False): - '''region is the slice of the from_layer in the following Python - index format: (ii_start, ii_end, jj_start, jj_end) - ''' - - from_idx = self.names.index(from_layer) - to_idx = self.names.index(to_layer) - assert from_idx >= to_idx, 'wrong 
order of from_layer and to_layer' - - ret = region - for ii in range(from_idx, to_idx, -1): - converter = self.converters[ii] - if verbose: - print 'pushing', self.names[ii], 'region', ret, 'through converter' - ret = converter(ret) - if verbose: - print 'Final region at ', self.names[to_idx], 'is', ret - - return ret - - - -def norm01c(arr, center): - '''Maps the input range to [0,1] such that the center value maps to .5''' - arr = arr.copy() - arr -= center - arr /= max(2 * arr.max(), -2 * arr.min()) + 1e-10 - arr += .5 - assert arr.min() >= 0 - assert arr.max() <= 1 - return arr - - - -def save_caffe_image(img, filename, autoscale = True, autoscale_center = None): - '''Takes an image in caffe format (01) or (c01, BGR) and saves it to a file''' - if len(img.shape) == 2: - # upsample grayscale 01 -> 01c - img = np.tile(img[:,:,np.newaxis], (1,1,3)) - else: - img = img[::-1].transpose((1,2,0)) - if autoscale_center is not None: - img = norm01c(img, autoscale_center) - elif autoscale: - img = img.copy() - img -= img.min() - img *= 1.0 / (img.max() + 1e-10) - skimage.io.imsave(filename, img) diff --git a/find_maxes/crop_max_patches.py b/find_maxes/crop_max_patches.py old mode 100644 new mode 100755 index 20161ace9..b40393b67 --- a/find_maxes/crop_max_patches.py +++ b/find_maxes/crop_max_patches.py @@ -1,67 +1,97 @@ #! 
/usr/bin/env python +# this import must come first to make sure we use the non-display backend +import matplotlib +matplotlib.use('Agg') + +# add parent folder to search path, to enable import of core modules like settings +import os,sys,inspect +currentdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe()))) +parentdir = os.path.dirname(currentdir) +sys.path.insert(0,parentdir) + import argparse -import ipdb as pdb +#import ipdb as pdb import cPickle as pickle -from loaders import load_imagenet_mean, load_labels, caffe +import settings +from caffevis.caffevis_helper import set_mean +from siamese_helper import SiameseHelper + from jby_misc import WithTimer from max_tracker import output_max_patches - +from find_max_acts import load_max_tracker_from_file +from settings_misc import load_network def main(): parser = argparse.ArgumentParser(description='Loads a pickled NetMaxTracker and outputs one or more of {the patches of the image, a deconv patch, a backprop patch} associated with the maxes.') - parser.add_argument('--N', type = int, default = 9, help = 'Note and save top N activations.') - parser.add_argument('--gpu', action = 'store_true', help = 'Use gpu.') - parser.add_argument('--do-maxes', action = 'store_true', help = 'Output max patches.') - parser.add_argument('--do-deconv', action = 'store_true', help = 'Output deconv patches.') - parser.add_argument('--do-deconv-norm', action = 'store_true', help = 'Output deconv-norm patches.') - parser.add_argument('--do-backprop', action = 'store_true', help = 'Output backprop patches.') - parser.add_argument('--do-backprop-norm', action = 'store_true', help = 'Output backprop-norm patches.') - parser.add_argument('--do-info', action = 'store_true', help = 'Output info file containing max filenames and labels.') - parser.add_argument('--idx-begin', type = int, default = None, help = 'Start at this unit (default: all units).') - parser.add_argument('--idx-end', type = int, default = None, help =
'End at this unit (default: all units).') + parser.add_argument('--N', type = int, default = 9, help = 'Note and save top N activations.') + parser.add_argument('--gpu', action = 'store_true', default=settings.caffevis_mode_gpu, help = 'Use gpu.') + parser.add_argument('--do-maxes', action = 'store_true', default=settings.max_tracker_do_maxes, help = 'Output max patches.') + parser.add_argument('--do-deconv', action = 'store_true', default=settings.max_tracker_do_deconv, help = 'Output deconv patches.') + parser.add_argument('--do-deconv-norm', action = 'store_true', default=settings.max_tracker_do_deconv_norm, help = 'Output deconv-norm patches.') + parser.add_argument('--do-backprop', action = 'store_true', default=settings.max_tracker_do_backprop, help = 'Output backprop patches.') + parser.add_argument('--do-backprop-norm', action = 'store_true', default=settings.max_tracker_do_backprop_norm, help = 'Output backprop-norm patches.') + parser.add_argument('--do-info', action = 'store_true', default=settings.max_tracker_do_info, help = 'Output info file containing max filenames and labels.') + parser.add_argument('--idx-begin', type = int, default = None, help = 'Start at this unit (default: all units).') + parser.add_argument('--idx-end', type = int, default = None, help = 'End at this unit (default: all units).') - parser.add_argument('nmt_pkl', type = str, help = 'Which pickled NetMaxTracker to load.') - parser.add_argument('net_prototxt', type = str, help = 'Network prototxt to load') - parser.add_argument('net_weights', type = str, help = 'Network weights to load') - parser.add_argument('datadir', type = str, help = 'Directory to look for files in') - parser.add_argument('filelist', type = str, help = 'List of image files to consider, one per line. Must be the same filelist used to produce the NetMaxTracker!') - parser.add_argument('outdir', type = str, help = r'Which output directory to use. 
Files are output into outdir/layer/unit_%%04d/{maxes,deconv,backprop}_%%03d.png') - parser.add_argument('layer', type = str, help = 'Which layer to output') - #parser.add_argument('--mean', type = str, default = '', help = 'data mean to load') + parser.add_argument('--nmt_pkl', type = str, default = os.path.join(settings.caffevis_outputs_dir, 'find_max_acts_output.pickled'), help = 'Which pickled NetMaxTracker to load.') + parser.add_argument('--net_prototxt', type = str, default = settings.caffevis_deploy_prototxt, help = 'network prototxt to load') + parser.add_argument('--net_weights', type = str, default = settings.caffevis_network_weights, help = 'network weights to load') + parser.add_argument('--datadir', type = str, default = settings.static_files_dir, help = 'directory to look for files in') + parser.add_argument('--filelist', type = str, default = settings.static_files_input_file, help = 'List of image files to consider, one per line. Must be the same filelist used to produce the NetMaxTracker!') + parser.add_argument('--outdir', type = str, default = settings.caffevis_outputs_dir, help = 'Which output directory to use. 
Files are output into outdir/layer/unit_%%04d/{maxes,deconv,backprop}_%%03d.png') + parser.add_argument('--search-min', action='store_true', default=False, help='Should we also search for minimal activations?') + args = parser.parse_args() - if args.gpu: - caffe.set_mode_gpu() - else: - caffe.set_mode_cpu() + settings.caffevis_deploy_prototxt = args.net_prototxt + settings.caffevis_network_weights = args.net_weights - imagenet_mean = load_imagenet_mean() - net = caffe.Classifier(args.net_prototxt, args.net_weights, - mean=imagenet_mean, - channel_swap=(2,1,0), - raw_scale=255, - image_dims=(256, 256)) + net, data_mean = load_network(settings) + + # validate batch size + if settings.is_siamese and settings._calculated_siamese_network_format == 'siamese_batch_pair': + # currently, no batch support for siamese_batch_pair networks + # it can be added by simply handling the batch indexes properly, but it should be thoroughly tested + assert (settings.max_tracker_batch_size == 1) + + # set network batch size + current_input_shape = net.blobs[net.inputs[0]].shape + current_input_shape[0] = settings.max_tracker_batch_size + net.blobs[net.inputs[0]].reshape(*current_input_shape) + net.reshape() assert args.do_maxes or args.do_deconv or args.do_deconv_norm or args.do_backprop or args.do_backprop_norm or args.do_info, 'Specify at least one do_* option to output.' - with open(args.nmt_pkl, 'rb') as ff: - nmt = pickle.load(ff) - mt = nmt.max_trackers[args.layer] + siamese_helper = SiameseHelper(settings.layers_list) - if args.idx_begin is None: - args.idx_begin = 0 - if args.idx_end is None: - args.idx_end = mt.max_vals.shape[0] - - with WithTimer('Saved %d images per unit for %s units %d:%d.' 
% (args.N, args.layer, args.idx_begin, args.idx_end)): - output_max_patches(mt, net, args.layer, args.idx_begin, args.idx_end, - args.N, args.datadir, args.filelist, args.outdir, - (args.do_maxes, args.do_deconv, args.do_deconv_norm, args.do_backprop, args.do_backprop_norm, args.do_info)) + nmt = load_max_tracker_from_file(args.nmt_pkl) + + for layer_name in settings.layers_to_output_in_offline_scripts: + + print 'Started work on layer %s' % (layer_name) + + normalized_layer_name = siamese_helper.normalize_layer_name_for_max_tracker(layer_name) + + mt = nmt.max_trackers[normalized_layer_name] + + idx_begin = args.idx_begin if args.idx_begin is not None else 0 + idx_end = args.idx_end if args.idx_end is not None else mt.max_vals.shape[0] + + with WithTimer('Saved %d images per unit for %s units %d:%d.' % (args.N, normalized_layer_name, idx_begin, idx_end)): + output_max_patches(settings, mt, net, normalized_layer_name, idx_begin, idx_end, + args.N, args.datadir, args.filelist, args.outdir, False, + (args.do_maxes, args.do_deconv, args.do_deconv_norm, args.do_backprop, args.do_backprop_norm, args.do_info)) + if args.search_min: + output_max_patches(settings, mt, net, normalized_layer_name, idx_begin, idx_end, + args.N, args.datadir, args.filelist, args.outdir, True, + (args.do_maxes, args.do_deconv, args.do_deconv_norm, args.do_backprop, args.do_backprop_norm, args.do_info)) if __name__ == '__main__': main() diff --git a/find_maxes/find_max_acts.py b/find_maxes/find_max_acts.py old mode 100644 new mode 100755 index c86eb2285..9a4082db4 --- a/find_maxes/find_max_acts.py +++ b/find_maxes/find_max_acts.py @@ -1,45 +1,108 @@ #!
/usr/bin/env python +# this import must come first to make sure we use the non-display backend +import matplotlib +matplotlib.use('Agg') + +# add parent folder to search path, to enable import of core modules like settings +import os,sys,inspect +currentdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe()))) +parentdir = os.path.dirname(currentdir) +sys.path.insert(0,parentdir) + import argparse -import ipdb as pdb import cPickle as pickle +import numpy as np + +import settings -from loaders import load_imagenet_mean, load_labels, caffe +from caffevis.caffevis_helper import set_mean from jby_misc import WithTimer -from max_tracker import scan_images_for_maxes +from max_tracker import scan_images_for_maxes, scan_pairs_for_maxes +from settings_misc import load_network + +from misc import mkdir_p + +def pickle_to_text(pickle_filename): + + with open(pickle_filename, 'rb') as pickle_file: + data = pickle.load(pickle_file) + data_dict = data.__dict__.copy() + with open(pickle_filename + ".txt", 'wt') as text_file: + text_file.write(str(data_dict)) + + return def main(): + parser = argparse.ArgumentParser(description='Finds images in a training set that cause max activation for a network; saves results in a pickled NetMaxTracker.') parser.add_argument('--N', type = int, default = 9, help = 'note and save top N activations') - parser.add_argument('--gpu', action = 'store_true', help = 'use gpu') - parser.add_argument('net_prototxt', type = str, default = '', help = 'network prototxt to load') - parser.add_argument('net_weights', type = str, default = '', help = 'network weights to load') - parser.add_argument('datadir', type = str, default = '.', help = 'directory to look for files in') - parser.add_argument('filelist', type = str, help = 'list of image files to consider, one per line') - parser.add_argument('outfile', type = str, help = 'output filename for pkl') - #parser.add_argument('--mean', type = str, default = '', help = 'data mean to 
load') + parser.add_argument('--gpu', action = 'store_true', default = settings.caffevis_mode_gpu, help = 'use gpu') + parser.add_argument('--net_prototxt', type = str, default = settings.caffevis_deploy_prototxt, help = 'network prototxt to load') + parser.add_argument('--net_weights', type = str, default = settings.caffevis_network_weights, help = 'network weights to load') + parser.add_argument('--datadir', type = str, default = settings.static_files_dir, help = 'directory to look for files in') + parser.add_argument('--outfile', type=str, default = os.path.join(settings.caffevis_outputs_dir, 'find_max_acts_output.pickled'), help='output filename for pkl') + parser.add_argument('--outdir', type = str, default = settings.caffevis_outputs_dir, help = 'Which output directory to use. Files are output into outdir/layer/unit_%%04d/{max_histogram}.png') + parser.add_argument('--do-histograms', action = 'store_true', default = settings.max_tracker_do_histograms, help = 'Output histogram image file containing histogram of max values per channel') + parser.add_argument('--do-correlation', action = 'store_true', default = settings.max_tracker_do_correlation, help = 'Output correlation image file containing correlation of channels per layer') + parser.add_argument('--search-min', action='store_true', default=False, help='Should we also search for minimal activations?') + args = parser.parse_args() - imagenet_mean = load_imagenet_mean() - net = caffe.Classifier(args.net_prototxt, args.net_weights, - mean=imagenet_mean, - channel_swap=(2,1,0), - raw_scale=255, - image_dims=(256, 256)) + settings.caffevis_deploy_prototxt = args.net_prototxt + settings.caffevis_network_weights = args.net_weights + + net, data_mean = load_network(settings) + + # validate batch size + if settings.is_siamese and settings._calculated_siamese_network_format == 'siamese_batch_pair': + # currently, no batch support for siamese_batch_pair networks + # it can be added by simply handling the batch indexes 
properly, but it should be thoroughly tested + assert (settings.max_tracker_batch_size == 1) - if args.gpu: - caffe.set_mode_gpu() - else: - caffe.set_mode_cpu() + # set network batch size + current_input_shape = net.blobs[net.inputs[0]].shape + current_input_shape[0] = settings.max_tracker_batch_size + net.blobs[net.inputs[0]].reshape(*current_input_shape) + net.reshape() with WithTimer('Scanning images'): - max_tracker = scan_images_for_maxes(net, args.datadir, args.filelist, args.N) + if settings.is_siamese: + net_max_tracker = scan_pairs_for_maxes(settings, net, args.datadir, args.N, args.outdir, args.search_min) + else: # normal operation + net_max_tracker = scan_images_for_maxes(settings, net, args.datadir, args.N, args.outdir, args.search_min) + + save_max_tracker_to_file(args.outfile, net_max_tracker) + + if args.do_correlation: + net_max_tracker.calculate_correlation(args.outdir) + + if args.do_histograms: + net_max_tracker.calculate_histograms(args.outdir) + + +def save_max_tracker_to_file(filename, net_max_tracker): + + dir_name = os.path.dirname(filename) + mkdir_p(dir_name) + with WithTimer('Saving maxes'): - with open(args.outfile, 'wb') as ff: - pickle.dump(max_tracker, ff, -1) + with open(filename, 'wb') as ff: + pickle.dump(net_max_tracker, ff, -1) + # save text version of pickle file for easier debugging + pickle_to_text(filename) + + +def load_max_tracker_from_file(filename): + + import max_tracker + # load pickle file + with open(filename, 'rb') as tracker_file: + net_max_tracker = pickle.load(tracker_file) + return net_max_tracker if __name__ == '__main__': diff --git a/find_maxes/loaders.py b/find_maxes/loaders.py index a714da701..8d1869406 100644 --- a/find_maxes/loaders.py +++ b/find_maxes/loaders.py @@ -2,25 +2,13 @@ from pylab import * -# Make sure that caffe is on the python path: -caffe_root = '../../' # this file is expected to be in {caffe_root}/experiments/something -import sys -loadpath = caffe_root + 'python_cpu' -print '= = = CAFFE 
LOADER: LOADING CPU VERSION from path: %s = = =' % loadpath -sys.path.insert(0, loadpath) # Use CPU compiled code for backprop vis/etc -import caffe -caffe.set_mode_cpu() - - - -def load_labels(): - with open('%s/data/ilsvrc12/synset_words.txt' % caffe_root) as ff: +def load_labels(settings): + with open('%s/data/ilsvrc12/synset_words.txt' % settings.caffevis_caffe_root) as ff: labels = [line.strip() for line in ff.readlines()] return labels - -def load_trained_net(model_prototxt = None, model_weights = None): +def load_trained_net(settings, model_prototxt = None, model_weights = None): assert (model_prototxt is None) == (model_weights is None), 'Specify both model_prototxt and model_weights or neither' if model_prototxt is None: load_dir = '/home/jyosinsk/results/140311_234854_afadfd3_priv_netbase_upgraded/' @@ -30,14 +18,23 @@ def load_trained_net(model_prototxt = None, model_weights = None): print 'LOADER: loading net:' print ' ', model_prototxt print ' ', model_weights + + sys.path.insert(0, os.path.join(settings.caffevis_caffe_root, 'python')) + import caffe + if settings.caffevis_mode_gpu: + caffe.set_mode_gpu() + print 'CaffeVisApp mode (in main thread): GPU' + else: + caffe.set_mode_cpu() + print 'CaffeVisApp mode (in main thread): CPU' + net = caffe.Classifier(model_prototxt, model_weights) #net.set_phase_test() return net - -def load_imagenet_mean(): - imagenet_mean = np.load(caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy') +def load_imagenet_mean(settings): + imagenet_mean = np.load(settings.caffevis_caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy') imagenet_mean = imagenet_mean[:, 14:14+227, 14:14+227] # (3,256,256) -> (3,227,227) Crop to center 227x227 section return imagenet_mean diff --git a/find_maxes/max_tracker.py b/find_maxes/max_tracker.py index 8bfd427de..5b8dfbb56 100644 --- a/find_maxes/max_tracker.py +++ b/find_maxes/max_tracker.py @@ -1,158 +1,619 @@ #! 
/usr/bin/env python -import os -import ipdb as pdb import errno +import os +import sys from datetime import datetime +import cv2 -#import caffe -from loaders import load_imagenet_mean, load_labels, caffe -from jby_misc import WithTimer -from caffe_misc import shownet, RegionComputer, save_caffe_image import numpy as np +from misc import mkdir_p, get_files_list +from image_misc import resize_without_fit +from caffe_misc import RegionComputer, save_caffe_image, get_max_data_extent, extract_patch_from_image, \ + compute_data_layer_focus_area, layer_name_to_top_name +from siamese_helper import SiameseHelper +from jby_misc import WithTimer +# define records -default_layers = ['conv1', 'conv2', 'conv3', 'conv4', 'conv5', 'fc6', 'fc7', 'fc8', 'prob'] -default_is_conv = [('conv' in ll) for ll in default_layers] -def hardcoded_get(): - prototxt = '/home/jyosinsk/results/140311_234854_afadfd3_priv_netbase_upgraded/deploy_1.prototxt' - weights = '/home/jyosinsk/results/140311_234854_afadfd3_priv_netbase_upgraded/caffe_imagenet_train_iter_450000' - datadir = '/home/jyosinsk/imagenet2012/val' - filelist = 'mini_valid.txt' +class MaxTrackerBatchRecord(object): - imagenet_mean = load_imagenet_mean() - net = caffe.Classifier(prototxt, weights, - mean=imagenet_mean, - channel_swap=(2,1,0), - raw_scale=255, - image_dims=(256, 256)) - net.set_phase_test() - net.set_mode_cpu() - labels = load_labels() + def __init__(self, image_idx = None, filename = None, im = None): + self.image_idx = image_idx + self.filename = filename + self.im = im - return net, imagenet_mean, labels, datadir, filelist +class MaxTrackerCropBatchRecord(object): + + def __init__(self, cc = None, channel_idx = None, info_filename = None, maxim_filenames = None, + deconv_filenames = None, deconvnorm_filenames = None, backprop_filenames = None, + backpropnorm_filenames = None, info_file = None, max_idx_0 = None, max_idx = None, im_idx = None, + selected_input_index = None, ii = None, jj = None, recorded_val = None, + 
out_ii_start = None, out_ii_end = None, out_jj_start = None, out_jj_end = None, data_ii_start = None, + data_ii_end = None, data_jj_start = None, data_jj_end = None, im = None, + denormalized_layer_name = None, denormalized_top_name = None, layer_format = None): + self.cc = cc + self.channel_idx = channel_idx + self.info_filename = info_filename + self.maxim_filenames = maxim_filenames + self.deconv_filenames = deconv_filenames + self.deconvnorm_filenames = deconvnorm_filenames + self.backprop_filenames = backprop_filenames + self.backpropnorm_filenames = backpropnorm_filenames + self.info_file = info_file + self.max_idx_0 = max_idx_0 + self.max_idx = max_idx + self.im_idx = im_idx + self.selected_input_index = selected_input_index + self.ii = ii + self.jj = jj + self.recorded_val = recorded_val + self.out_ii_start = out_ii_start + self.out_ii_end = out_ii_end + self.out_jj_start = out_jj_start + self.out_jj_end = out_jj_end + self.data_ii_start = data_ii_start + self.data_ii_end = data_ii_end + self.data_jj_start = data_jj_start + self.data_jj_end = data_jj_end + self.im = im + self.denormalized_layer_name = denormalized_layer_name + self.denormalized_top_name = denormalized_top_name + self.layer_format = layer_format -def mkdir_p(path): - # From https://stackoverflow.com/questions/600268/mkdir-p-functionality-in-python - try: - os.makedirs(path) - except OSError as exc: # Python >2.5 - if exc.errno == errno.EEXIST and os.path.isdir(path): - pass - else: - raise +class InfoFileMetadata(object): + + def __init__(self, info_file = None, ref_count = None): + self.info_file = info_file + self.ref_count = ref_count + + +def prepare_max_histogram(layer_name, n_channels, channel_to_histogram_values, process_channel_figure, process_layer_figure): + + import matplotlib.pyplot as plt + + fig = plt.figure(figsize=(10, 10)) + ax = fig.add_subplot(111) + + # for each channel + percent_dead = np.zeros((n_channels), dtype=np.float32) + for channel_idx in xrange(n_channels): + + 
if channel_idx % 100 == 0: + print "calculating histogram for channel %d out of %d" % (channel_idx, n_channels) + + hist, bin_edges = channel_to_histogram_values(channel_idx) + + # generate histogram image file + width = 0.7 * (bin_edges[1] - bin_edges[0]) + center = (bin_edges[:-1] + bin_edges[1:]) / 2 + + barlist = ax.bar(center, hist, align='center', width=width, color='g') + + for i in range(len(hist)): + if 0 >= bin_edges[i] and 0 < bin_edges[i+1]: + # mark dead bar in red + barlist[i].set_color('r') + + # save percent dead + percent_dead[channel_idx] = 100.0 * hist[i] / sum(hist) + + break + + fig.suptitle('max activations histogram of layer %s channel %d' % (layer_name,channel_idx)) + ax.xaxis.label.set_text('max activation value') + ax.yaxis.label.set_text('inputs count') + + process_channel_figure(channel_idx, fig) + + ax.cla() + + # generate histogram for layer + num_bins = 20 + hist, bin_edges = np.histogram(100 - percent_dead, bins=num_bins, range=(0, 100)) + width = 0.7 * (bin_edges[1] - bin_edges[0]) + center = (bin_edges[:-1] + bin_edges[1:]) / 2 + + bar_colors = [None] * num_bins + begin_color = np.array([1.0, 0, 0]) + end_color = np.array([0, 1.0, 0]) + color_step = (end_color - begin_color) / (num_bins - 1) + current_color = begin_color + for i in range(num_bins): + bar_colors[i] = tuple(current_color) + current_color += color_step + ax.bar(center, hist, align='center', width=width, color=bar_colors) + + fig.suptitle('activity of layer %s' % (layer_name)) + ax.xaxis.label.set_text('activity percent') + ax.yaxis.label.set_text('channels count') + + process_layer_figure(fig) + + fig.clf() + plt.close(fig) + + pass class MaxTracker(object): - def __init__(self, is_conv, n_channels, n_top = 10, initial_val = -1e99, dtype = 'float32'): - self.is_conv = is_conv - self.max_vals = np.ones((n_channels, n_top), dtype = dtype) * initial_val + + def __init__(self, is_spatial, n_channels, n_top = 10, initial_val = -1e99, dtype = 'float32', search_min = False): 
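The per-channel loop in `prepare_max_histogram` above treats the histogram bin whose range contains zero as the "dead" bin (drawn in red) and records what percentage of inputs fall into it. A numpy-only sketch of that bookkeeping, with the matplotlib plotting stripped out (the function name here is mine, not part of the toolbox):

```python
import numpy as np

def percent_dead_from_maxes(max_vals, bins=50):
    # histogram of one channel's max activation over all inputs
    hist, bin_edges = np.histogram(max_vals, bins=bins)
    for i in range(len(hist)):
        # the bin whose half-open range [edge_i, edge_{i+1}) contains zero
        # holds the inputs that never activated this channel
        if bin_edges[i] <= 0 < bin_edges[i + 1]:
            return 100.0 * hist[i] / hist.sum()
    return 0.0  # zero falls outside every bin: the channel is never dead

# five inputs never activate the channel, five activate it strongly
maxes = np.concatenate([np.zeros(5), 10.0 * np.ones(5)])
dead = percent_dead_from_maxes(maxes, bins=2)  # 50.0
```

The layer-level "inactivity" plot then just histograms `100 - percent_dead` over all channels.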
+ self.is_spatial = is_spatial self.n_top = n_top - if is_conv: - self.max_locs = -np.ones((n_channels, n_top, 4), dtype = 'int') # image_idx, image_class, i, j + self.search_min = search_min + + self.max_vals = np.ones((n_channels, n_top), dtype=dtype) * initial_val + if is_spatial: + self.max_locs = -np.ones((n_channels, n_top, 4), dtype = 'int') # image_idx, selected_input_index, i, j else: - self.max_locs = -np.ones((n_channels, n_top, 2), dtype = 'int') # image_idx, image_class + self.max_locs = -np.ones((n_channels, n_top, 2), dtype = 'int') # image_idx, selected_input_index + + if self.search_min: + self.min_vals = np.ones((n_channels, n_top), dtype=dtype) * (-initial_val) + if is_spatial: + self.min_locs = -np.ones((n_channels, n_top, 4), dtype='int') # image_idx, selected_input_index, i, j + else: + self.min_locs = -np.ones((n_channels, n_top, 2), dtype='int') # image_idx, selected_input_index + + # set of seen inputs, used to avoid updating on the same input twice + self.seen_inputs = set() + + # will hold a list of np array, each containing the max values of all the channels for one input + self.all_max_vals = list() + + # keeps a map between channel index and histogram values + self.channel_to_histogram = [None] * n_channels + + def __getstate__(self): + # Copy the object's state from self.__dict__ which contains + # all our instance attributes. Always use the dict.copy() + # method to avoid modifying the original state. + state = self.__dict__.copy() + # Remove the unpicklable entries. + del state['seen_inputs'] + del state['all_max_vals'] + return state + + def __setstate__(self, state): + # Restore instance attributes (i.e., filename and lineno). 
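`MaxTracker` drops its transient bookkeeping (`seen_inputs`, `all_max_vals`) before pickling via `__getstate__`, then stubs the attributes back in `__setstate__`. The pattern in isolation, on a toy class with illustrative values:

```python
import pickle

class Tracker(object):
    def __init__(self):
        self.max_vals = [1, 2, 3]   # the state worth persisting
        self.seen_inputs = set()    # transient bookkeeping, dropped on pickle
        self.all_max_vals = []      # can grow large, also dropped

    def __getstate__(self):
        # copy the instance dict so the live object is not modified,
        # then remove the entries we do not want in the pickle
        state = self.__dict__.copy()
        del state['seen_inputs']
        del state['all_max_vals']
        return state

    def __setstate__(self, state):
        # restore the persisted attributes and stub out the dropped ones
        self.__dict__.update(state)
        self.seen_inputs = None
        self.all_max_vals = None

restored = pickle.loads(pickle.dumps(Tracker()))
```

Anything that unpickles a tracker (as `load_max_tracker_from_file` does) must therefore tolerate `None` in the dropped fields.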
+ self.__dict__.update(state) + + self.seen_inputs = None + self.all_max_vals = None + + def __repr__(self): + return str(self.__dict__.copy()) + + def update(self, data, image_idx, selected_input_index, layer_unique_input_source, layer_name): + + # if unique_input_source already exist, we can skip the update since we've already seen it + if layer_unique_input_source in self.seen_inputs: + return + + # add input identifier to seen inputs set + self.seen_inputs.add(layer_unique_input_source) - def update(self, blob, image_idx, image_class): - data = blob[0] # Note: makes a copy of blob, e.g. (96,55,55) n_channels = data.shape[0] - data_unroll = data.reshape((n_channels, -1)) # Note: no copy eg (96,3025). Does nothing if not is_conv + data_unroll = data.reshape((n_channels, -1)) # Note: no copy eg (96,3025). Does nothing if not is_spatial + + max_indexes = data_unroll.argmax(1) # maxes for each channel, eg. (96,) - maxes = data_unroll.argmax(1) # maxes for each channel, eg. (96,) + # add maxes for all channels to a list, bounded to avoid consuming too much memory + maxes = data_unroll[range(n_channels), max_indexes] + MAX_LIST_SIZE = 10000 + if len(self.all_max_vals) < MAX_LIST_SIZE: + self.all_max_vals.append(maxes) + + produced_warning = False #insertion_idx = zeros((n_channels,)) #pdb.set_trace() for ii in xrange(n_channels): - idx = np.searchsorted(self.max_vals[ii], data_unroll[ii, maxes[ii]]) - if idx == 0: - # Smaller than all 10 + + max_value = data_unroll[ii, max_indexes[ii]] + + # skip nan + if np.isnan(max_value): + # only warn once + if not produced_warning: + print 'WARNING: got NAN activation on input', str(layer_unique_input_source) + produced_warning = True continue - # Store new max in the proper order. 
Update both arrays: - # self.max_vals: - self.max_vals[ii,:idx-1] = self.max_vals[ii,1:idx] # shift lower values - self.max_vals[ii,idx-1] = data_unroll[ii, maxes[ii]] # store new max value - # self.max_locs - self.max_locs[ii,:idx-1] = self.max_locs[ii,1:idx] # shift lower location data - # store new location - if self.is_conv: - self.max_locs[ii,idx-1] = (image_idx, image_class) + np.unravel_index(maxes[ii], data.shape[1:]) - else: - self.max_locs[ii,idx-1] = (image_idx, image_class) + idx = np.searchsorted(self.max_vals[ii], max_value) + # if not smaller than all elements + if idx != 0: + # Store new value in the proper order. Update both arrays: + # self.max_vals: + self.max_vals[ii,:idx-1] = self.max_vals[ii,1:idx] # shift lower values + self.max_vals[ii,idx-1] = max_value # store new max value + # self.max_locs + self.max_locs[ii,:idx-1] = self.max_locs[ii,1:idx] # shift lower location data + # store new location + if self.is_spatial: + self.max_locs[ii,idx-1] = (image_idx, selected_input_index) + np.unravel_index(max_indexes[ii], data.shape[1:]) + else: + self.max_locs[ii,idx-1] = (image_idx, selected_input_index) + + if self.search_min: + idx = np.searchsorted(self.min_vals[ii], max_value) + # if not bigger than all elements + if idx != self.n_top: + # Store new value in the proper order. 
Update both arrays: + # self.min_vals: + self.min_vals[ii, (idx+1):(self.n_top)] = self.min_vals[ii, idx:(self.n_top-1)] # shift upper values + self.min_vals[ii, idx] = max_value # store new value + # self.min_locs + self.min_locs[ii, (idx+1):(self.n_top)] = self.min_locs[ii, idx:(self.n_top-1)] # shift upper location data + # store new location + if self.is_spatial: + self.min_locs[ii, idx] = (image_idx, selected_input_index) + np.unravel_index(max_indexes[ii], data.shape[1:]) + else: + self.min_locs[ii, idx] = (image_idx, selected_input_index) + + def calculate_histogram(self, layer_name, outdir): + + # convert list of arrays to numpy array + all_max_array = np.vstack(self.all_max_vals) + + def channel_to_histogram_values(channel_idx): + # get values + max_for_single_channel = all_max_array[:, channel_idx] + + # create histogram + hist, bin_edges = np.histogram(max_for_single_channel, bins=50) + + # save histogram values + self.channel_to_histogram[channel_idx] = (hist, bin_edges) + + return hist, bin_edges + + def process_channel_figure(channel_idx, fig): + unit_dir = os.path.join(outdir, layer_name, 'unit_%04d' % channel_idx) + mkdir_p(unit_dir) + filename = os.path.join(unit_dir, 'max_histogram.png') + fig.savefig(filename) + pass + + def process_layer_figure(fig): + filename = os.path.join(outdir, layer_name, 'layer_inactivity.png') + fig.savefig(filename) + pass + + n_channels = all_max_array.shape[1] + prepare_max_histogram(layer_name, n_channels, channel_to_histogram_values, process_channel_figure, process_layer_figure) + + pass + + def calculate_correlation(self, layer_name, outdir): + + # convert list of arrays to numpy array + all_max_array = np.vstack(self.all_max_vals) + + # skip layers with only one channel + if all_max_array.shape[1] == 1: + return + + corr = np.corrcoef(all_max_array.transpose()) + + # fix possible NANs + corr = np.nan_to_num(corr) + np.fill_diagonal(corr, 1) + + # sort correlation matrix + # import cPickle as pickle + # with 
open('corr_%s.pickled' % layer_name, 'wb') as ff: + # pickle.dump(corr, ff, protocol=2) + + # alternative sorting + # values = np.dot(corr, np.arange(corr.shape[0])) + # indexes = np.argsort(values) + + indexes = np.lexsort(corr) + sorted_corr = corr[indexes,:][:,indexes] + + # plot correlation matrix + import matplotlib.pyplot as plt + fig = plt.figure(figsize=(10, 10)) + plt.subplot(1, 1, 1) + plt.imshow(sorted_corr, interpolation='nearest', vmin=-1, vmax=1) + plt.colorbar() + plt.title('channels activations correlation matrix for layer %s' % (layer_name)) + plt.tight_layout() + + # save correlation matrix + layer_dir = os.path.join(outdir, layer_name) + mkdir_p(layer_dir) + filename = os.path.join(layer_dir, 'channels_correlation.png') + fig.savefig(filename, bbox_inches='tight') + + plt.close() + + return class NetMaxTracker(object): - def __init__(self, layers = default_layers, is_conv = default_is_conv, n_top = 10, initial_val = -1e99, dtype = 'float32'): + def __init__(self, settings, layers, n_top = 10, initial_val = -1e99, dtype = 'float32', search_min = False): self.layers = layers - self.is_conv = is_conv self.init_done = False self.n_top = n_top + self.search_min = search_min self.initial_val = initial_val + self.settings = settings + self.siamese_helper = SiameseHelper(self.settings.layers_list) def _init_with_net(self, net): self.max_trackers = {} - for layer,is_conv in zip(self.layers, self.is_conv): - blob = net.blobs[layer].data - self.max_trackers[layer] = MaxTracker(is_conv, blob.shape[1], n_top = self.n_top, - initial_val = self.initial_val, - dtype = blob.dtype) + for layer_name in self.layers: + + print 'init layer: ', layer_name + top_name = layer_name_to_top_name(net, layer_name) + blob = net.blobs[top_name].data + + # normalize layer name, this is used for siamese networks where we want layers "conv_1" and "conv_1_p" to + # count as the same layer in terms of activations + normalized_layer_name = 
self.siamese_helper.normalize_layer_name_for_max_tracker(layer_name) + + is_spatial = (len(blob.shape) == 4) + + # only add normalized layer once + if normalized_layer_name not in self.max_trackers: + self.max_trackers[normalized_layer_name] = MaxTracker(is_spatial, blob.shape[1], n_top = self.n_top, + initial_val = self.initial_val, + dtype = blob.dtype, search_min = self.search_min) + self.init_done = True - - def update(self, net, image_idx, image_class): + + def update(self, net, image_idx, net_unique_input_source, batch_index): '''Updates the maxes found so far with the state of the given net. If a new max is found, it is stored together with the image_idx.''' + if not self.init_done: self._init_with_net(net) - for layer in self.layers: - blob = net.blobs[layer].data - self.max_trackers[layer].update(blob, image_idx, image_class) + for layer_name in self.layers: + + # print "processing layer %s" % layer_name + + top_name = layer_name_to_top_name(net, layer_name) + blob = net.blobs[top_name].data + + normalized_layer_name = self.siamese_helper.normalize_layer_name_for_max_tracker(layer_name) + + layer_format = self.siamese_helper.get_layer_format_by_layer_name(layer_name) + + # in siamese network, implemented as pairs of layers, we might need to select one of the images from the siamese pair + if self.settings.is_siamese and layer_format == 'siamese_layer_pair': + selected_input_index = self.siamese_helper.get_index_of_saved_image_by_layer_name(layer_name) + + if selected_input_index == 0: + # first image identifier is selected + self.max_trackers[normalized_layer_name].update(blob[batch_index], image_idx, selected_input_index, + net_unique_input_source[0], layer_name) + + elif selected_input_index == 1: + # second image identifier is selected + self.max_trackers[normalized_layer_name].update(blob[batch_index], image_idx, selected_input_index, + net_unique_input_source[1], layer_name) + + elif selected_input_index == -1: + # both images are selected + 
self.max_trackers[normalized_layer_name].update(blob[batch_index], image_idx, selected_input_index, + net_unique_input_source, layer_name) + + # in siamese network, implemented as single layer with batch 2, we might need to select one of the images from the siamese pair + elif self.settings.is_siamese and layer_format == 'siamese_batch_pair': + + assert (self.settings.max_tracker_batch_size == 1) + + # if batch size is 2, then we have two outputs in this layer, so we update both of them + if blob.shape[0] == 2: + + # update first output + self.max_trackers[normalized_layer_name].update(blob[0], image_idx, 0, + net_unique_input_source[0], layer_name) + + # update second output + self.max_trackers[normalized_layer_name].update(blob[1], image_idx, 1, + net_unique_input_source[1], layer_name) + + # we have single output + elif blob.shape[0] == 1: + self.max_trackers[normalized_layer_name].update(blob[0], image_idx, -1, + net_unique_input_source, layer_name) + + else: # normal, non-siamese network + self.max_trackers[normalized_layer_name].update(blob[batch_index], image_idx, -1, net_unique_input_source, layer_name) + + pass + + def calculate_histograms(self, outdir): + + print "calculate_histograms on network" + for layer_name in self.layers: + print "calculate_histogram on layer %s" % layer_name + + # normalize layer name, this is used for siamese networks where we want layers "conv_1" and "conv_1_p" to + # count as the same layer in terms of activations + normalized_layer_name = self.siamese_helper.normalize_layer_name_for_max_tracker(layer_name) + + self.max_trackers[normalized_layer_name].calculate_histogram(layer_name, outdir) + + pass + + def calculate_correlation(self, outdir): + + print "calculate_correlation on network" + for layer_name in self.layers: + print "calculate_correlation on layer %s" % layer_name + + # normalize layer name, this is used for siamese networks where we want layers "conv_1" and "conv_1_p" to + # count as the same layer in terms of 
activations + normalized_layer_name = self.siamese_helper.normalize_layer_name_for_max_tracker(layer_name) + + self.max_trackers[normalized_layer_name].calculate_correlation(layer_name, outdir) + pass -def load_file_list(filelist): - image_filenames = [] - image_labels = [] - with open(filelist, 'r') as ff: - for line in ff.readlines(): - fields = line.strip().split() - image_filenames.append(fields[0]) - image_labels.append(int(fields[1])) - return image_filenames, image_labels + def __getstate__(self): + # Copy the object's state from self.__dict__ which contains + # all our instance attributes. Always use the dict.copy() + # method to avoid modifying the original state. + state = self.__dict__.copy() + # Remove the unpicklable entries. + del state['settings'] + del state['siamese_helper'] + return state + def __setstate__(self, state): + # Restore instance attributes (i.e., filename and lineno). + self.__dict__.update(state) -def scan_images_for_maxes(net, datadir, filelist, n_top): - image_filenames, image_labels = load_file_list(filelist) + self.settings = None + self.siamese_helper = None + +def scan_images_for_maxes(settings, net, datadir, n_top, outdir, search_min): + image_filenames, image_labels = get_files_list(settings) print 'Scanning %d files' % len(image_filenames) print ' First file', os.path.join(datadir, image_filenames[0]) - tracker = NetMaxTracker(n_top = n_top) + sys.path.insert(0, os.path.join(settings.caffevis_caffe_root, 'python')) + import caffe + + tracker = NetMaxTracker(settings, n_top = n_top, layers=settings.layers_to_output_in_offline_scripts, search_min=search_min) + + net_input_dims = net.blobs['data'].data.shape[2:4] + + # prepare variables used for batches + batch = [None] * settings.max_tracker_batch_size + for i in range(0, settings.max_tracker_batch_size): + batch[i] = MaxTrackerBatchRecord() + + batch_index = 0 + for image_idx in xrange(len(image_filenames)): - filename = image_filenames[image_idx] - image_class = 
image_labels[image_idx] - #im = caffe.io.load_image('../../data/ilsvrc12/mini_ilsvrc_valid/sized/ILSVRC2012_val_00000610.JPEG') - do_print = (image_idx % 100 == 0) + + batch[batch_index].image_idx = image_idx + batch[batch_index].filename = image_filenames[image_idx] + + do_print = (batch[batch_index].image_idx % 100 == 0) if do_print: - print '%s Image %d/%d' % (datetime.now().ctime(), image_idx, len(image_filenames)) + print '%s Image %d/%d' % (datetime.now().ctime(), batch[batch_index].image_idx, len(image_filenames)) + with WithTimer('Load image', quiet = not do_print): - im = caffe.io.load_image(os.path.join(datadir, filename)) - with WithTimer('Predict ', quiet = not do_print): - net.predict([im], oversample = False) # Just take center crop - with WithTimer('Update ', quiet = not do_print): - tracker.update(net, image_idx, image_class) + try: + batch[batch_index].im = caffe.io.load_image(os.path.join(datadir, batch[batch_index].filename), color=not settings._calculated_is_gray_model) + batch[batch_index].im = resize_without_fit(batch[batch_index].im, net_input_dims) + batch[batch_index].im = batch[batch_index].im.astype(np.float32) + except: + # skip bad/missing inputs + print "WARNING: skipping bad/missing input:", batch[batch_index].filename + continue + + batch_index += 1 + + # if current batch is full + if batch_index == settings.max_tracker_batch_size \ + or image_idx == len(image_filenames) - 1: # or last iteration + + # batch predict + with WithTimer('Predict on batch ', quiet = not do_print): + im_batch = [record.im for record in batch] + net.predict(im_batch, oversample = False) # Just take center crop + + # go over batch and update statistics + for i in range(0,batch_index): + + with WithTimer('Update ', quiet = not do_print): + tracker.update(net, batch[i].image_idx, net_unique_input_source=batch[i].filename, batch_index=i) + + batch_index = 0 print 'done!' 
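The scanning loop above accumulates one `MaxTrackerBatchRecord` per image until the batch is full (or the file list is exhausted), and only then runs a single `net.predict` over the whole batch before updating the tracker. A minimal, self-contained sketch of that accumulate-and-flush pattern (the `iterate_in_batches` helper and its names are illustrative, not part of the toolbox):

```python
# Sketch of the batch-accumulation pattern used by scan_images_for_maxes:
# buffer records until the batch is full or the input is exhausted, then
# flush the whole batch in one call (in the toolbox, a batched predict).
def iterate_in_batches(items, batch_size, flush):
    batch = []
    for idx, item in enumerate(items):
        batch.append((idx, item))
        # flush on a full batch, or on the last item regardless of fill level
        if len(batch) == batch_size or idx == len(items) - 1:
            flush(batch)  # e.g. net.predict([im for _, im in batch])
            batch = []

collected = []
iterate_in_batches(list('abcdefg'), 3, collected.append)
# seven items with batch_size 3 yield batches of sizes 3, 3, 1
```

Flushing on the last index as well as on a full batch is what lets the final, partially filled batch still be processed, mirroring the `batch_index == settings.max_tracker_batch_size or image_idx == len(image_filenames) - 1` condition above.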
return tracker +def scan_pairs_for_maxes(settings, net, datadir, n_top, outdir, search_min): + image_filenames, image_labels = get_files_list(settings) + print 'Scanning %d pairs' % len(image_filenames) + print ' First pair', image_filenames[0] + + sys.path.insert(0, os.path.join(settings.caffevis_caffe_root, 'python')) + import caffe + + tracker = NetMaxTracker(settings, n_top=n_top, layers=settings.layers_to_output_in_offline_scripts, search_min=search_min) + + net_input_dims = net.blobs['data'].data.shape[2:4] + + # prepare variables used for batches + batch = [None] * settings.max_tracker_batch_size + for i in range(0, settings.max_tracker_batch_size): + batch[i] = MaxTrackerBatchRecord() -def save_representations(net, datadir, filelist, layer, first_N = None): - image_filenames, image_labels = load_file_list(filelist) + batch_index = 0 + + for image_idx in xrange(len(image_filenames)): + + batch[batch_index].image_idx = image_idx + batch[batch_index].images_pair = image_filenames[image_idx] + filename1 = batch[batch_index].images_pair[0] + filename2 = batch[batch_index].images_pair[1] + + do_print = (image_idx % 100 == 0) + if do_print: + print '%s Pair %d/%d' % (datetime.now().ctime(), batch[batch_index].image_idx, len(image_filenames)) + + with WithTimer('Load image', quiet=not do_print): + try: + im1 = caffe.io.load_image(os.path.join(datadir, filename1), color=not settings._calculated_is_gray_model) + im2 = caffe.io.load_image(os.path.join(datadir, filename2), color=not settings._calculated_is_gray_model) + + if settings.siamese_input_mode == 'concat_channelwise': + im1 = resize_without_fit(im1, net_input_dims) + im2 = resize_without_fit(im2, net_input_dims) + batch[batch_index].im = np.concatenate((im1, im2), axis=2) + + elif settings.siamese_input_mode == 'concat_along_width': + half_input_dims = (net_input_dims[0], net_input_dims[1] / 2) + im1 = resize_without_fit(im1, half_input_dims) + im2 = resize_without_fit(im2, half_input_dims) + 
batch[batch_index].im = np.concatenate((im1, im2), axis=1) + + except: + # skip bad/missing inputs + print "WARNING: skipping bad/missing inputs:", filename1, filename2 + continue + + batch_index += 1 + + # if current batch is full + if batch_index == settings.max_tracker_batch_size \ + or image_idx == len(image_filenames) - 1: # or last iteration + + with WithTimer('Predict ', quiet=not do_print): + im_batch = [record.im for record in batch] + net.predict(im_batch, oversample=False) + + # go over batch and update statistics + for i in range(0,batch_index): + with WithTimer('Update ', quiet=not do_print): + tracker.update(net, batch[i].image_idx, net_unique_input_source=batch[i].images_pair, batch_index=i) + + batch_index = 0 + + print 'done!' + return tracker + + +def save_representations(settings, net, datadir, filelist, layer_name, first_N = None): + image_filenames, image_labels = get_files_list(filelist) if first_N is None: first_N = len(image_filenames) assert first_N <= len(image_filenames) @@ -161,188 +622,396 @@ def save_representations(net, datadir, filelist, layer, first_N = None): assert len(image_indices) > 0 print ' First file', os.path.join(datadir, image_filenames[image_indices[0]]) + sys.path.insert(0, os.path.join(settings.caffevis_caffe_root, 'python')) + import caffe + indices = None rep = None for ii,image_idx in enumerate(image_indices): filename = image_filenames[image_idx] - image_class = image_labels[image_idx] do_print = (image_idx % 10 == 0) if do_print: print '%s Image %d/%d' % (datetime.now().ctime(), image_idx, len(image_indices)) with WithTimer('Load image', quiet = not do_print): - im = caffe.io.load_image(os.path.join(datadir, filename)) + im = caffe.io.load_image(os.path.join(datadir, filename), color=not settings._calculated_is_gray_model) with WithTimer('Predict ', quiet = not do_print): net.predict([im], oversample = False) # Just take center crop with WithTimer('Store ', quiet = not do_print): + top_name = 
layer_name_to_top_name(net, layer_name) if rep is None: - rep_shape = net.blobs[layer].data[0].shape # e.g. (256,13,13) + rep_shape = net.blobs[top_name].data[0].shape # e.g. (256,13,13) rep = np.zeros((len(image_indices),) + rep_shape) # e.g. (1000,256,13,13) indices = [0] * len(image_indices) indices[ii] = image_idx - rep[ii] = net.blobs[layer].data[0] + rep[ii] = net.blobs[top_name].data[0] print 'done!' return indices,rep +def generate_output_names(unit_dir, num_top, do_info, do_maxes, do_deconv, do_deconv_norm, do_backprop, do_backprop_norm, search_min): -def get_max_data_extent(net, layer, rc, is_conv): - '''Gets the maximum size of the data layer that can influence a unit on layer.''' - if is_conv: - conv_size = net.blobs[layer].data.shape[2:4] # e.g. (13,13) for conv5 - layer_slice_middle = (conv_size[0]/2,conv_size[0]/2+1, conv_size[1]/2,conv_size[1]/2+1) # e.g. (6,7,6,7,), the single center unit - data_slice = rc.convert_region(layer, 'data', layer_slice_middle) - return data_slice[1]-data_slice[0], data_slice[3]-data_slice[2] # e.g. (163, 163) for conv5 - else: - # Whole data region - return net.blobs['data'].data.shape[2:4] # e.g. 
(227,227) for fc6,fc7,fc8,prop + # init values + info_filename = [] + maxim_filenames = [] + deconv_filenames = [] + deconvnorm_filenames = [] + backprop_filenames = [] + backpropnorm_filenames = [] + + prefix = 'min_' if search_min else '' + + if do_info: + info_filename = [os.path.join(unit_dir, prefix + 'info.txt')] + + for max_idx_0 in range(num_top): + if do_maxes: + maxim_filenames.append(os.path.join(unit_dir, prefix + 'maxim_%03d.png' % max_idx_0)) + if do_deconv: + deconv_filenames.append(os.path.join(unit_dir, prefix + 'deconv_%03d.png' % max_idx_0)) + if do_deconv_norm: + deconvnorm_filenames.append(os.path.join(unit_dir, prefix + 'deconvnorm_%03d.png' % max_idx_0)) -def output_max_patches(max_tracker, net, layer, idx_begin, idx_end, num_top, datadir, filelist, outdir, do_which): + if do_backprop: + backprop_filenames.append(os.path.join(unit_dir, prefix + 'backprop_%03d.png' % max_idx_0)) + + if do_backprop_norm: + backpropnorm_filenames.append(os.path.join(unit_dir, prefix + 'backpropnorm_%03d.png' % max_idx_0)) + + return (info_filename, maxim_filenames, deconv_filenames, deconvnorm_filenames, backprop_filenames, backpropnorm_filenames) + + +def output_max_patches(settings, max_tracker, net, layer_name, idx_begin, idx_end, num_top, datadir, filelist, outdir, search_min, do_which): do_maxes, do_deconv, do_deconv_norm, do_backprop, do_backprop_norm, do_info = do_which assert do_maxes or do_deconv or do_deconv_norm or do_backprop or do_backprop_norm or do_info, 'nothing to do' + sys.path.insert(0, os.path.join(settings.caffevis_caffe_root, 'python')) + import caffe + mt = max_tracker - rc = RegionComputer() - - image_filenames, image_labels = load_file_list(filelist) - print 'Loaded filenames and labels for %d files' % len(image_filenames) - print ' First file', os.path.join(datadir, image_filenames[0]) - num_top_in_mt = mt.max_locs.shape[1] + locs = mt.min_locs if search_min else mt.max_locs + vals = mt.min_vals if search_min else mt.max_vals + + 
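`generate_output_names` above encodes the search direction in a filename prefix: minimum-activation runs write `min_`-prefixed files alongside the default maximum-activation outputs, and each of the top-N images gets a zero-padded index. A small sketch of just that naming convention (simplified to the info and maxim outputs; the unit directory name here is made up):

```python
import os

def sketch_output_names(unit_dir, num_top, search_min):
    # 'min_' prefix distinguishes least-activating outputs from the
    # default maximum-activation ones, as in generate_output_names
    prefix = 'min_' if search_min else ''
    info_filename = os.path.join(unit_dir, prefix + 'info.txt')
    maxim_filenames = [os.path.join(unit_dir, prefix + 'maxim_%03d.png' % i)
                       for i in range(num_top)]
    return info_filename, maxim_filenames

info, maxims = sketch_output_names('unit_0005', 2, search_min=True)
# info   -> .../min_info.txt
# maxims -> .../min_maxim_000.png, .../min_maxim_001.png
```

Keeping the min/max outputs in the same per-unit directory, distinguished only by prefix, is also what makes the "skip generation if all outputs exist" check above work for both search directions.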
image_filenames, image_labels = get_files_list(settings) + + if settings.is_siamese: + print 'Loaded filenames and labels for %d pairs' % len(image_filenames) + print ' First pair', image_filenames[0] + else: + print 'Loaded filenames and labels for %d files' % len(image_filenames) + print ' First file', os.path.join(datadir, image_filenames[0]) + + siamese_helper = SiameseHelper(settings.layers_list) + + num_top_in_mt = locs.shape[1] assert num_top <= num_top_in_mt, 'Requested %d top images but MaxTracker contains only %d' % (num_top, num_top_in_mt) assert idx_end >= idx_begin, 'Range error' - size_ii, size_jj = get_max_data_extent(net, layer, rc, mt.is_conv) + # minor fix for backwards compatability + if hasattr(mt, 'is_conv'): + mt.is_spatial = mt.is_conv + + size_ii, size_jj = get_max_data_extent(net, settings, layer_name, mt.is_spatial) data_size_ii, data_size_jj = net.blobs['data'].data.shape[2:4] - + + net_input_dims = net.blobs['data'].data.shape[2:4] + + # prepare variables used for batches + batch = [None] * settings.max_tracker_batch_size + for i in range(0, settings.max_tracker_batch_size): + batch[i] = MaxTrackerCropBatchRecord() + + batch_index = 0 + + channel_to_info_file = dict() + n_total_images = (idx_end-idx_begin) * num_top for cc, channel_idx in enumerate(range(idx_begin, idx_end)): - unit_dir = os.path.join(outdir, layer, 'unit_%04d' % channel_idx) + + unit_dir = os.path.join(outdir, layer_name, 'unit_%04d' % channel_idx) mkdir_p(unit_dir) + # check if all required outputs exist, in which case skip this iteration + [info_filename, + maxim_filenames, + deconv_filenames, + deconvnorm_filenames, + backprop_filenames, + backpropnorm_filenames] = generate_output_names(unit_dir, num_top, do_info, do_maxes, do_deconv, do_deconv_norm, do_backprop, do_backprop_norm, search_min) + + relevant_outputs = info_filename + \ + maxim_filenames + \ + deconv_filenames + \ + deconvnorm_filenames + \ + backprop_filenames + \ + backpropnorm_filenames + + # we skip 
generation if: + # 1. all outputs exist, AND + # 2.1. (not last iteration OR + # 2.2. last iteration, but batch is empty) + relevant_outputs_exist = [os.path.exists(file_name) for file_name in relevant_outputs] + if all(relevant_outputs_exist) and \ + ((channel_idx != idx_end - 1) or ((channel_idx == idx_end - 1) and (batch_index == 0))): + print "skipped generation of channel %d in layer %s since files already exist" % (channel_idx, layer_name) + continue + if do_info: - info_filename = os.path.join(unit_dir, 'info.txt') - info_file = open(info_filename, 'w') - print >>info_file, '# is_conv val image_idx image_class i(if is_conv) j(if is_conv) filename' + channel_to_info_file[channel_idx] = InfoFileMetadata() + channel_to_info_file[channel_idx].info_file = open(info_filename[0], 'w') + channel_to_info_file[channel_idx].ref_count = num_top + + print >> channel_to_info_file[channel_idx].info_file, '# is_spatial val image_idx selected_input_index i(if is_spatial) j(if is_spatial) filename' # iterate through maxes from highest (at end) to lowest for max_idx_0 in range(num_top): - max_idx = num_top_in_mt - 1 - max_idx_0 - if mt.is_conv: - im_idx, im_class, ii, jj = mt.max_locs[channel_idx, max_idx] + batch[batch_index].cc = cc + batch[batch_index].channel_idx = channel_idx + batch[batch_index].info_filename = info_filename + batch[batch_index].maxim_filenames = maxim_filenames + batch[batch_index].deconv_filenames = deconv_filenames + batch[batch_index].deconvnorm_filenames = deconvnorm_filenames + batch[batch_index].backprop_filenames = backprop_filenames + batch[batch_index].backpropnorm_filenames = backpropnorm_filenames + batch[batch_index].info_file = channel_to_info_file[channel_idx].info_file + + batch[batch_index].max_idx_0 = max_idx_0 + batch[batch_index].max_idx = num_top_in_mt - 1 - batch[batch_index].max_idx_0 + + if mt.is_spatial: + + # fix for backward compatability + if locs.shape[2] == 5: + # remove second column + locs = np.delete(locs, 1, 2) + + 
batch[batch_index].im_idx, batch[batch_index].selected_input_index, batch[batch_index].ii, batch[batch_index].jj = locs[batch[batch_index].channel_idx, batch[batch_index].max_idx] else: - im_idx, im_class = mt.max_locs[channel_idx, max_idx] - recorded_val = mt.max_vals[channel_idx, max_idx] - filename = image_filenames[im_idx] - do_print = (max_idx_0 == 0) - if do_print: - print '%s Output file/image(s) %d/%d' % (datetime.now().ctime(), cc * num_top, n_total_images) + # fix for backward compatability + if locs.shape[2] == 3: + # remove second column + locs = np.delete(locs, 1, 2) - if mt.is_conv: - # Compute the focus area of the data layer - layer_indices = (ii,ii+1,jj,jj+1) - data_indices = rc.convert_region(layer, 'data', layer_indices) - data_ii_start,data_ii_end,data_jj_start,data_jj_end = data_indices + batch[batch_index].im_idx, batch[batch_index].selected_input_index = locs[batch[batch_index].channel_idx, batch[batch_index].max_idx] + batch[batch_index].ii, batch[batch_index].jj = 0, 0 - touching_imin = (data_ii_start == 0) - touching_jmin = (data_jj_start == 0) + # if ii and jj are invalid then there is no data for this "top" image, so we can skip it + if (batch[batch_index].ii, batch[batch_index].jj) == (-1,-1): + continue - # Compute how much of the data slice falls outside the actual data [0,max] range - ii_outside = size_ii - (data_ii_end - data_ii_start) # possibly 0 - jj_outside = size_jj - (data_jj_end - data_jj_start) # possibly 0 + batch[batch_index].recorded_val = vals[batch[batch_index].channel_idx, batch[batch_index].max_idx] + batch[batch_index].filename = image_filenames[batch[batch_index].im_idx] + do_print = (batch[batch_index].max_idx_0 == 0) + if do_print: + print '%s Output file/image(s) %d/%d layer %s channel %d' % (datetime.now().ctime(), batch[batch_index].cc * num_top, n_total_images, layer_name, batch[batch_index].channel_idx) - if touching_imin: - out_ii_start = ii_outside - out_ii_end = size_ii - else: - out_ii_start = 0 - 
out_ii_end = size_ii - ii_outside - if touching_jmin: - out_jj_start = jj_outside - out_jj_end = size_jj - else: - out_jj_start = 0 - out_jj_end = size_jj - jj_outside - else: - ii,jj = 0,0 - data_ii_start, out_ii_start, data_jj_start, out_jj_start = 0,0,0,0 - data_ii_end, out_ii_end, data_jj_end, out_jj_end = size_ii, size_ii, size_jj, size_jj + # print "DEBUG: (mt.is_spatial, batch[batch_index].ii, batch[batch_index].jj, layer_name, size_ii, size_jj, data_size_ii, data_size_jj)", str((mt.is_spatial, batch[batch_index].ii, batch[batch_index].jj, rc, layer_name, size_ii, size_jj, data_size_ii, data_size_jj)) + + [batch[batch_index].out_ii_start, + batch[batch_index].out_ii_end, + batch[batch_index].out_jj_start, + batch[batch_index].out_jj_end, + batch[batch_index].data_ii_start, + batch[batch_index].data_ii_end, + batch[batch_index].data_jj_start, + batch[batch_index].data_jj_end] = \ + compute_data_layer_focus_area(mt.is_spatial, batch[batch_index].ii, batch[batch_index].jj, settings, layer_name, + size_ii, size_jj, data_size_ii, data_size_jj) + + # print "DEBUG: channel:%d out_ii_start:%d out_ii_end:%d out_jj_start:%d out_jj_end:%d data_ii_start:%d data_ii_end:%d data_jj_start:%d data_jj_end:%d" % \ + # (channel_idx, + # batch[batch_index].out_ii_start, batch[batch_index].out_ii_end, + # batch[batch_index].out_jj_start, batch[batch_index].out_jj_end, + # batch[batch_index].data_ii_start, batch[batch_index].data_ii_end, + # batch[batch_index].data_jj_start, batch[batch_index].data_jj_end) if do_info: - print >>info_file, 1 if mt.is_conv else 0, '%.6f' % mt.max_vals[channel_idx, max_idx], - if mt.is_conv: - print >>info_file, '%d %d %d %d' % tuple(mt.max_locs[channel_idx, max_idx]), + print >> batch[batch_index].info_file, 1 if mt.is_spatial else 0, '%.6f' % vals[batch[batch_index].channel_idx, batch[batch_index].max_idx], + if mt.is_spatial: + print >> batch[batch_index].info_file, '%d %d %d %d' % tuple(locs[batch[batch_index].channel_idx, 
batch[batch_index].max_idx]), else: - print >>info_file, '%d %d' % tuple(mt.max_locs[channel_idx, max_idx]), - print >>info_file, filename + print >> batch[batch_index].info_file, '%d %d' % tuple(locs[batch[batch_index].channel_idx, batch[batch_index].max_idx]), + print >> batch[batch_index].info_file, batch[batch_index].filename if not (do_maxes or do_deconv or do_deconv_norm or do_backprop or do_backprop_norm): continue - with WithTimer('Load image', quiet = not do_print): - im = caffe.io.load_image(os.path.join(datadir, filename)) - with WithTimer('Predict ', quiet = not do_print): - net.predict([im], oversample = False) # Just take center crop, same as in scan_images_for_maxes - if len(net.blobs[layer].data.shape) == 4: - reproduced_val = net.blobs[layer].data[0,channel_idx,ii,jj] - else: - reproduced_val = net.blobs[layer].data[0,channel_idx] - if abs(reproduced_val - recorded_val) > .1: - print 'Warning: recorded value %s is suspiciously different from reproduced value %s. Is the filelist the same?' 
% (recorded_val, reproduced_val) - - if do_maxes: - #grab image from data layer, not from im (to ensure preprocessing / center crop details match between image and deconv/backprop) - out_arr = np.zeros((3,size_ii,size_jj), dtype='float32') - out_arr[:, out_ii_start:out_ii_end, out_jj_start:out_jj_end] = net.blobs['data'].data[0,:,data_ii_start:data_ii_end,data_jj_start:data_jj_end] - with WithTimer('Save img ', quiet = not do_print): - save_caffe_image(out_arr, os.path.join(unit_dir, 'maxim_%03d.png' % max_idx_0), - autoscale = False, autoscale_center = 0) - - if do_deconv or do_deconv_norm: - diffs = net.blobs[layer].diff * 0 - if len(diffs.shape) == 4: - diffs[0,channel_idx,ii,jj] = 1.0 + if settings.is_siamese: + # in siamese network, filename is a pair of image file names + filename1 = batch[batch_index].filename[0] + filename2 = batch[batch_index].filename[1] + + # load both images + im1 = caffe.io.load_image(os.path.join(datadir, filename1), color=not settings._calculated_is_gray_model) + im2 = caffe.io.load_image(os.path.join(datadir, filename2), color=not settings._calculated_is_gray_model) + + if settings.siamese_input_mode == 'concat_channelwise': + + # resize images according to input dimension + im1 = resize_without_fit(im1, net_input_dims) + im2 = resize_without_fit(im2, net_input_dims) + + # concatenate channelwise + batch[batch_index].im = np.concatenate((im1, im2), axis=2) + + # convert to float to avoid caffe destroying the image in the scaling phase + batch[batch_index].im = batch[batch_index].im.astype(np.float32) + + elif settings.siamese_input_mode == 'concat_along_width': + half_input_dims = (net_input_dims[0], net_input_dims[1] / 2) + im1 = resize_without_fit(im1, half_input_dims) + im2 = resize_without_fit(im2, half_input_dims) + batch[batch_index].im = np.concatenate((im1, im2), axis=1) + + # convert to float to avoid caffe destroying the image in the scaling phase + batch[batch_index].im = batch[batch_index].im.astype(np.float32) + else: - 
diffs[0,channel_idx] = 1.0 - with WithTimer('Deconv ', quiet = not do_print): - net.deconv_from_layer(layer, diffs) - - out_arr = np.zeros((3,size_ii,size_jj), dtype='float32') - out_arr[:, out_ii_start:out_ii_end, out_jj_start:out_jj_end] = net.blobs['data'].diff[0,:,data_ii_start:data_ii_end,data_jj_start:data_jj_end] - if out_arr.max() == 0: - print 'Warning: Deconv out_arr in range', out_arr.min(), 'to', out_arr.max(), 'ensure force_backward: true in prototxt' - if do_deconv: - with WithTimer('Save img ', quiet = not do_print): - save_caffe_image(out_arr, os.path.join(unit_dir, 'deconv_%03d.png' % max_idx_0), - autoscale = False, autoscale_center = 0) - if do_deconv_norm: - out_arr = np.linalg.norm(out_arr, axis=0) - with WithTimer('Save img ', quiet = not do_print): - save_caffe_image(out_arr, os.path.join(unit_dir, 'deconvnorm_%03d.png' % max_idx_0)) - - if do_backprop or do_backprop_norm: - diffs = net.blobs[layer].diff * 0 - diffs[0,channel_idx,ii,jj] = 1.0 - with WithTimer('Backward ', quiet = not do_print): - net.backward_from_layer(layer, diffs) - - out_arr = np.zeros((3,size_ii,size_jj), dtype='float32') - out_arr[:, out_ii_start:out_ii_end, out_jj_start:out_jj_end] = net.blobs['data'].diff[0,:,data_ii_start:data_ii_end,data_jj_start:data_jj_end] - if out_arr.max() == 0: - print 'Warning: Deconv out_arr in range', out_arr.min(), 'to', out_arr.max(), 'ensure force_backward: true in prototxt' - if do_backprop: - with WithTimer('Save img ', quiet = not do_print): - save_caffe_image(out_arr, os.path.join(unit_dir, 'backprop_%03d.png' % max_idx_0), - autoscale = False, autoscale_center = 0) - if do_backprop_norm: - out_arr = np.linalg.norm(out_arr, axis=0) - with WithTimer('Save img ', quiet = not do_print): - save_caffe_image(out_arr, os.path.join(unit_dir, 'backpropnorm_%03d.png' % max_idx_0)) - - if do_info: - info_file.close() + # load image + batch[batch_index].im = caffe.io.load_image(os.path.join(datadir, batch[batch_index].filename), color=not 
settings._calculated_is_gray_model) + + # resize images according to input dimension + batch[batch_index].im = resize_without_fit(batch[batch_index].im, net_input_dims) + + # convert to float to avoid caffe destroying the image in the scaling phase + batch[batch_index].im = batch[batch_index].im.astype(np.float32) + + batch_index += 1 + + # if current batch is full + if batch_index == settings.max_tracker_batch_size \ + or ((channel_idx == idx_end - 1) and (max_idx_0 == num_top - 1)): # or last iteration + + with WithTimer('Predict on batch ', quiet = not do_print): + im_batch = [record.im for record in batch] + net.predict(im_batch, oversample = False) + + # go over batch and update statistics + for i in range(0, batch_index): + + # in siamese network, we wish to return from the normalized layer name and selected input index to the + # denormalized layer name, e.g. from "conv1_1" and selected_input_index=1 to "conv1_1_p" + batch[i].denormalized_layer_name = siamese_helper.denormalize_layer_name_for_max_tracker(layer_name, batch[i].selected_input_index) + batch[i].denormalized_top_name = layer_name_to_top_name(net, batch[i].denormalized_layer_name) + batch[i].layer_format = siamese_helper.get_layer_format_by_layer_name(layer_name) + + if len(net.blobs[batch[i].denormalized_top_name].data.shape) == 4: + if settings.is_siamese and batch[i].layer_format == 'siamese_batch_pair': + reproduced_val = net.blobs[batch[i].denormalized_top_name].data[batch[i].selected_input_index, batch[i].channel_idx, batch[i].ii, batch[i].jj] + + else: # normal network, or siamese in siamese_layer_pair format + reproduced_val = net.blobs[batch[i].denormalized_top_name].data[i, batch[i].channel_idx, batch[i].ii, batch[i].jj] + + else: + if settings.is_siamese and batch[i].layer_format == 'siamese_batch_pair': + reproduced_val = net.blobs[batch[i].denormalized_top_name].data[batch[i].selected_input_index, batch[i].channel_idx] + + else: # normal network, or siamese in siamese_layer_pair 
format + reproduced_val = net.blobs[batch[i].denormalized_top_name].data[i, batch[i].channel_idx] + + if abs(reproduced_val - batch[i].recorded_val) > .1: + print 'Warning: recorded value %s is suspiciously different from reproduced value %s. Is the filelist the same?' % (batch[i].recorded_val, reproduced_val) + + if do_maxes: + #grab image from data layer, not from im (to ensure preprocessing / center crop details match between image and deconv/backprop) + + out_arr = extract_patch_from_image(net.blobs['data'].data[i], net, batch[i].selected_input_index, settings, + batch[i].data_ii_end, batch[i].data_ii_start, batch[i].data_jj_end, batch[i].data_jj_start, + batch[i].out_ii_end, batch[i].out_ii_start, batch[i].out_jj_end, batch[i].out_jj_start, size_ii, size_jj) + + with WithTimer('Save img ', quiet = not do_print): + save_caffe_image(out_arr, batch[i].maxim_filenames[batch[i].max_idx_0], + autoscale = False, autoscale_center = 0) + + if do_deconv or do_deconv_norm: + + # TODO: we can improve performance by doing batch of deconv_from_layer, but only if we group + # together instances which have the same selected_input_index, this can be done by holding two + # separate batches + + for i in range(0, batch_index): + diffs = net.blobs[batch[i].denormalized_top_name].diff * 0 + + if settings.is_siamese and batch[i].layer_format == 'siamese_batch_pair': + if diffs.shape[0] == 2: + if len(diffs.shape) == 4: + diffs[batch[i].selected_input_index, batch[i].channel_idx, batch[i].ii, batch[i].jj] = 1.0 + else: + # note: the following will not crash, since we already checked we have 2 outputs, so selected_input_index is either 0 or 1 + assert batch[i].selected_input_index != -1 + diffs[batch[i].selected_input_index, batch[i].channel_idx] = 1.0 + elif diffs.shape[0] == 1: + if len(diffs.shape) == 4: + diffs[0, batch[i].channel_idx, batch[i].ii, batch[i].jj] = 1.0 + else: + diffs[0, batch[i].channel_idx] = 1.0 + + else: # normal network, or siamese in siamese_layer_pair format 
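The deconv/backprop code here seeds the backward pass by zeroing the layer's diff buffer and writing a single 1.0 at the unit of interest. A hedged sketch of that one-hot seeding, following the indexing used above (spatial blobs indexed `(n, channel, i, j)`, fully-connected blobs `(n, channel)`); the helper name is illustrative:

```python
import numpy as np

def one_hot_diffs(blob_shape, n, channel, ii=0, jj=0):
    # zero gradient everywhere, 1.0 only at the selected unit,
    # as done before net.deconv_from_layer / net.backward_from_layer
    diffs = np.zeros(blob_shape, dtype=np.float32)
    if len(blob_shape) == 4:      # spatial layer, e.g. a conv blob
        diffs[n, channel, ii, jj] = 1.0
    else:                         # fully-connected layer
        diffs[n, channel] = 1.0
    return diffs

d = one_hot_diffs((1, 256, 13, 13), 0, 5, 6, 6)
# exactly one nonzero entry, at [0, 5, 6, 6]
```

Starting the backward pass from such a one-hot diff is what isolates the receptive field of a single unit in the resulting `data` gradient.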
+ if len(diffs.shape) == 4: + diffs[i, batch[i].channel_idx, batch[i].ii, batch[i].jj] = 1.0 + else: + diffs[i, batch[i].channel_idx] = 1.0 + + with WithTimer('Deconv ', quiet = not do_print): + net.deconv_from_layer(batch[i].denormalized_layer_name, diffs, zero_higher=True, deconv_type='Guided Backprop') + + out_arr = extract_patch_from_image(net.blobs['data'].diff[i], net, batch[i].selected_input_index, settings, + batch[i].data_ii_end, batch[i].data_ii_start, batch[i].data_jj_end, batch[i].data_jj_start, + batch[i].out_ii_end, batch[i].out_ii_start, batch[i].out_jj_end, batch[i].out_jj_start, size_ii, size_jj) + + if out_arr.max() == 0: + print 'Warning: Deconv out_arr in range', out_arr.min(), 'to', out_arr.max(), 'ensure force_backward: true in prototxt' + + if do_deconv: + with WithTimer('Save img ', quiet=not do_print): + save_caffe_image(out_arr, batch[i].deconv_filenames[batch[i].max_idx_0], + autoscale=False, autoscale_center=0) + if do_deconv_norm: + out_arr = np.linalg.norm(out_arr, axis=0) + with WithTimer('Save img ', quiet=not do_print): + save_caffe_image(out_arr, batch[i].deconvnorm_filenames[batch[i].max_idx_0]) + + if do_backprop or do_backprop_norm: + + for i in range(0, batch_index): + diffs = net.blobs[batch[i].denormalized_top_name].diff * 0 + + if len(diffs.shape) == 4: + diffs[i, batch[i].channel_idx, batch[i].ii, batch[i].jj] = 1.0 + else: + diffs[i, batch[i].channel_idx] = 1.0 + + with WithTimer('Backward batch ', quiet = not do_print): + net.backward_from_layer(batch[i].denormalized_layer_name, diffs) + + for i in range(0, batch_index): + + out_arr = extract_patch_from_image(net.blobs['data'].diff[i], net, batch[i].selected_input_index, settings, + batch[i].data_ii_end, batch[i].data_ii_start, batch[i].data_jj_end, batch[i].data_jj_start, + batch[i].out_ii_end, batch[i].out_ii_start, batch[i].out_jj_end, batch[i].out_jj_start, size_ii, size_jj) + + if out_arr.max() == 0: + print 'Warning: Deconv out_arr in range', out_arr.min(), 'to', 
out_arr.max(), 'ensure force_backward: true in prototxt' + if do_backprop: + with WithTimer('Save img ', quiet = not do_print): + save_caffe_image(out_arr, batch[i].backprop_filenames[batch[i].max_idx_0], + autoscale = False, autoscale_center = 0) + if do_backprop_norm: + out_arr = np.linalg.norm(out_arr, axis=0) + with WithTimer('Save img ', quiet = not do_print): + save_caffe_image(out_arr, batch[i].backpropnorm_filenames[batch[i].max_idx_0]) + + # close info files + for i in range(0, batch_index): + channel_to_info_file[batch[i].channel_idx].ref_count -= 1 + if channel_to_info_file[batch[i].channel_idx].ref_count == 0: + if do_info: + channel_to_info_file[batch[i].channel_idx].info_file.close() + + batch_index = 0 diff --git a/generate_max_input_images.sh b/generate_max_input_images.sh new file mode 100755 index 000000000..c52f51f6b --- /dev/null +++ b/generate_max_input_images.sh @@ -0,0 +1,5 @@ +#!/usr/bin/env bash + +./find_maxes/find_max_acts.py +./find_maxes/crop_max_patches.py + diff --git a/generate_min_and_max_input_images.sh b/generate_min_and_max_input_images.sh new file mode 100755 index 000000000..e5f294f40 --- /dev/null +++ b/generate_min_and_max_input_images.sh @@ -0,0 +1,6 @@ +#!/usr/bin/env bash + +./find_maxes/find_max_acts.py --search-min --N=10 +./find_maxes/crop_max_patches.py --search-min --N=10 + + diff --git a/generate_synthesized_images.sh b/generate_synthesized_images.sh new file mode 100755 index 000000000..831e9cbb8 --- /dev/null +++ b/generate_synthesized_images.sh @@ -0,0 +1,4 @@ +#!/usr/bin/env bash + +./optimize_image.py + diff --git a/image_misc.py b/image_misc.py index 6c940d8cc..e21057279 100644 --- a/image_misc.py +++ b/image_misc.py @@ -1,14 +1,70 @@ #! 
/usr/bin/env python +import io import cv2 import numpy as np + +# note: these horrible lines solves an unknown segmentation fault, presumably related to opencv / skimage version issues +cv2.namedWindow('test') +cv2.destroyWindow('test') + import skimage import skimage.io -from copy import deepcopy +import matplotlib.pyplot as plt + from misc import WithTimer +def fig2data(fig): + """ + @brief Convert a Matplotlib figure to a 3D numpy array with RGB channels and return it + @param fig a matplotlib figure + @return a numpy 3D array of RGB values + """ + + # alternative implementation - which might be slower + # buf = io.BytesIO() + # fig.savefig(buf, format='png') + # buf.seek(0) + # image = caffe_load_image(buf, color=True, as_uint=True) + # buf.close() + #return image + + # Get the RGB buffer from the figure + fig.canvas.draw() + w, h = fig.canvas.get_width_height() + buf = np.fromstring(fig.canvas.tostring_rgb(), dtype=np.uint8) + buf.shape = (w, h, 3) + return buf + + + +def array_histogram(arr, histogram_pane_shape, title, xlabel, ylabel): + fig = plt.figure(figsize=(10, 10), facecolor='white') + ax = fig.add_subplot(111) + + # generate histogram + values = arr.flatten() + hist, bin_edges = np.histogram(values, bins=50) + + width = 0.7 * (bin_edges[1] - bin_edges[0]) + center = (bin_edges[:-1] + bin_edges[1:]) / 2 + ax.bar(center, hist, align='center', width=width, color='g') + + fig.suptitle(title) + ax.xaxis.label.set_text(xlabel) + ax.yaxis.label.set_text(ylabel) + + figure_buffer = fig2data(fig) + + ax.cla() + fig.clf() + plt.close(fig) + + return figure_buffer + + def norm01(arr): arr = arr.copy() arr -= arr.min() @@ -51,28 +107,50 @@ def cv2_read_cap_rgb(cap, saveto = None): frame = frame[:,:,::-1] # Convert native OpenCV BGR -> RGB return frame - -def cv2_read_file_rgb(filename): - '''Reads an image from file. 
Always returns (x,y,3)''' - im = cv2.imread(filename) + +def gray_to_color(im): + if len(im.shape) == 2: # Upconvert single channel grayscale to color - im = im[:,:,np.newaxis] + im = im[:, :, np.newaxis] if im.shape[2] == 1: - im = np.tile(im, (1,1,3)) - if im.shape[2] > 3: - # Chop off transparency - im = im[:,:,:3] - im = im[:,:,::-1] # Convert native OpenCV BGR -> RGB + im = np.tile(im, (1, 1, 3)) + + return im + + +def cv2_read_file_rgb(filename, as_grayscale = False): + '''Reads an image from file. Returns (x,y,3) or (x,y,1) depending on as_grayscale parameter''' + + if as_grayscale: + im = cv2.imread(filename, cv2.CV_LOAD_IMAGE_GRAYSCALE) + if len(im.shape) == 2: + # Upconvert single channel grayscale to color + im = im[:, :, np.newaxis] + + else: + im = cv2.imread(filename) + im = gray_to_color(im) + if im.shape[2] > 3: + # Chop off transparency + im = im[:,:,:3] + im = im[:,:,::-1] # Convert native OpenCV BGR -> RGB + return im -def read_cam_frame(cap, saveto = None): +def read_cam_frame(cap, saveto = None, color = True): #frame = np.array(cv2_read_cap_rgb(cap, saveto = saveto), dtype='float32') frame = cv2_read_cap_rgb(cap, saveto = saveto) + if not color: + frame = cv2.cvtColor(frame, cv2.COLOR_RGB2GRAY) + frame = frame[:, :, np.newaxis] + frame = frame[:,::-1,:] # flip L-R for display frame -= frame.min() - frame = frame * (255.0 / (frame.max() + 1e-6)) + frame = frame * (1.0 / (frame.max() + 1e-6)) + + return frame @@ -277,6 +355,15 @@ def ensure_uint255_and_resize_to_fit(img, out_max_shape, shrink_interpolation = shrink_interpolation, grow_interpolation = grow_interpolation) +def ensure_uint255_and_resize_without_fit(img, out_max_shape, + shrink_interpolation = cv2.INTER_LINEAR, + grow_interpolation = cv2.INTER_NEAREST): + as_uint255 = ensure_uint255(img) + return resize_without_fit(as_uint255, out_max_shape, + dtype_out = 'uint8', + shrink_interpolation = shrink_interpolation, + grow_interpolation = grow_interpolation) + def ensure_uint255(arr): 
'''If data is float, multiply by 255 and convert to uint8. Else leave as uint8.''' @@ -354,11 +441,76 @@ def resize_to_fit(img, out_max_shape, if convert_early: img = np.array(img, dtype=dtype_out) - out = cv2.resize(img, - (int(img.shape[1] * scale), int(img.shape[0] * scale)), # in (c,r) order - interpolation = grow_interpolation if scale > 1 else shrink_interpolation) + + if len(img.shape) == 3: + out = np.stack([cv2.resize(img[:,:,i], + (int(round(img.shape[1] * scale)), int(round(img.shape[0] * scale))), # in (c,r) order + interpolation = grow_interpolation if scale > 1 else shrink_interpolation) + for i in range(img.shape[2])], axis=2) + else: + out = cv2.resize(img, + (int(round(img.shape[1] * scale)), int(round(img.shape[0] * scale))), # in (c,r) order + interpolation=grow_interpolation if scale > 1 else shrink_interpolation) + if convert_late: out = np.array(out, dtype=dtype_out) + + # fix resize of grayscale images + if len(img.shape) == 3 and img.shape[2] == 1 and len(out.shape) == 2: + out = out[:,:,np.newaxis] + + return out + + +def resize_without_fit(img, out_max_shape, + dtype_out = None, + shrink_interpolation = cv2.INTER_LINEAR, + grow_interpolation = cv2.INTER_NEAREST): + '''Resizes (without fit) to out_max_shape. + + If one of the out_max_shape dimensions is None, then use only the other dimension to perform resizing. 
+ + ''' + + if dtype_out is not None and img.dtype != dtype_out: + dtype_in_size = img.dtype.itemsize + dtype_out_size = np.dtype(dtype_out).itemsize + convert_early = (dtype_out_size < dtype_in_size) + convert_late = not convert_early + else: + convert_early = False + convert_late = False + if out_max_shape[0] is None: + scale_0 = float(out_max_shape[1]) / img.shape[1] + scale_1 = float(out_max_shape[1]) / img.shape[1] + elif out_max_shape[1] is None: + scale_1 = float(out_max_shape[0]) / img.shape[0] + scale_0 = float(out_max_shape[0]) / img.shape[0] + else: + + scale_0 = float(out_max_shape[0]) / img.shape[0] + scale_1 = float(out_max_shape[1]) / img.shape[1] + + if convert_early: + img = np.array(img, dtype=dtype_out) + + if len(img.shape) == 3: + out = np.stack([cv2.resize(img[:,:,i], + (int(round(img.shape[1] * scale_1)), int(round(img.shape[0] * scale_0))), # in (c,r) order + interpolation=grow_interpolation if min(scale_0, scale_1) > 1 else shrink_interpolation) + for i in range(img.shape[2])], axis=2) + else: + out = cv2.resize(img, + (int(round(img.shape[1] * scale_1)), int(round(img.shape[0] * scale_0))), # in (c,r) order + interpolation=grow_interpolation if min(scale_0, scale_1) > 1 else shrink_interpolation) + + if convert_late: + out = np.array(out, dtype=dtype_out) + + # fix resize of grayscale images + if len(img.shape) == 3 and img.shape[2] == 1 and len(out.shape) == 2: + out = out[:, :, np.newaxis] + return out @@ -387,6 +539,7 @@ def cv2_typeset_text(data, lines, loc, between = ' ', string_spacing = 0, line_s Returns: locy: new y location = loc[1] + y-offset resulting from lines of text + boxes: list of text boxes, one per formatted string, in the format (start_x, end_x, start_y, end_y, text) ''' data_width = data.shape[1] @@ -398,19 +551,22 @@ def cv2_typeset_text(data, lines, loc, between = ' ', string_spacing = 0, line_s lines = [lines] assert isinstance(lines, list), 'lines must be a list of
lines or list of FormattedString objects or a single FormattedString object' if len(lines) == 0: - return loc[1] + return loc[1], [] if not isinstance(lines[0], list): # If a single line of text is given as a list of strings, convert to multiline format lines = [lines] locy = loc[1] + boxes = list() + line_num = 0 while line_num < len(lines): line = lines[line_num] maxy = 0 locx = loc[0] for ii,fs in enumerate(line): + text = fs.string last_on_line = (ii == len(line) - 1) if not last_on_line: fs.string += between @@ -434,7 +590,8 @@ def cv2_typeset_text(data, lines, loc, between = ' ', string_spacing = 0, line_s lines.insert(line_num+1, new_next_line) break ###line_num += 1 - ###continue + ###continue + boxes.append((locx, locx + boxsize[0], locy - boxsize[1], locy, text)) cv2.putText(data, fs.string, (locx,locy), fs.face, fs.fsize, fs.clr, fs.thick) maxy = max(maxy, boxsize[1]) if fs.width is not None: @@ -450,8 +607,7 @@ def cv2_typeset_text(data, lines, loc, between = ' ', string_spacing = 0, line_s line_num += 1 locy += maxy + line_spacing - return locy - + return locy, boxes def saveimage(filename, im): @@ -464,11 +620,17 @@ def saveimage(filename, im): cv2.imwrite(filename, 255*im) - def saveimagesc(filename, im): saveimage(filename, norm01(im)) - def saveimagescc(filename, im, center): saveimage(filename, norm01c(im, center)) + + +def gray_to_colormap(map_name, gray_image): + + cmap = plt.get_cmap(map_name) + rgba_image = cmap(gray_image) + rgb_image = np.delete(rgba_image, 3, 2) + return rgb_image diff --git a/input_fetcher.py b/input_fetcher.py index 95eb7a9fd..d489f8449 100644 --- a/input_fetcher.py +++ b/input_fetcher.py @@ -6,9 +6,10 @@ import numpy as np from codependent_thread import CodependentThread -from image_misc import cv2_imshow_rgb, cv2_read_file_rgb, read_cam_frame, crop_to_square -from misc import tsplit +from image_misc import cv2_imshow_rgb, read_cam_frame, crop_to_square +from misc import tsplit, get_files_list +import caffe class 
InputImageFetcher(CodependentThread): '''Fetches images from a webcam or loads from a directory.''' @@ -45,12 +46,17 @@ def __init__(self, settings): # latest loaded image frame, holds the pixels and used to force reloading self.latest_static_frame = None + # latest label for loaded image + self.latest_label = None + # keeps current index of loaded file self.static_file_idx = None # contains the requested number of increments for file index self.static_file_idx_increment = 0 - + + self.available_files, self.labels = get_files_list(self.settings) + def bind_camera(self): # Due to OpenCV limitations, this should be called from the main thread print 'InputImageFetcher: bind_camera starting' @@ -100,6 +106,7 @@ def set_mode_stretch_on(self): if not self.static_file_stretch_mode: self.static_file_stretch_mode = True self.latest_static_frame = None # Force reload + self.latest_label = None #self.latest_frame_is_from_cam = True # Force reload def set_mode_stretch_off(self): @@ -107,6 +114,7 @@ def set_mode_stretch_off(self): if self.static_file_stretch_mode: self.static_file_stretch_mode = False self.latest_static_frame = None # Force reload + self.latest_label = None #self.latest_frame_is_from_cam = True # Force reload def toggle_stretch_mode(self): @@ -125,14 +133,26 @@ def run(self): if self.freeze_cam and self.latest_cam_frame is not None: # If static file mode was switched to cam mode but cam is still frozen, we need to push the cam frame again if not self.latest_frame_is_from_cam: + + # future feature: implement a more interesting combination of using a camera in siamese mode + if self.settings.is_siamese: + im = (self.latest_cam_frame, self.latest_cam_frame) + else: + im = self.latest_cam_frame + + self._increment_and_set_frame(im, True) - self._increment_and_set_frame(self.latest_cam_frame, True) else: - frame_full = read_cam_frame(self.bound_cap_device) + frame_full = read_cam_frame(self.bound_cap_device, color=not
self.settings._calculated_is_gray_model) #print '====> just read frame', frame_full.shape frame = crop_to_square(frame_full) with self.lock: self.latest_cam_frame = frame - self._increment_and_set_frame(self.latest_cam_frame, True) + + if self.settings.is_siamese: + im = (self.latest_cam_frame, self.latest_cam_frame) + else: + im = self.latest_cam_frame + self._increment_and_set_frame(im, True) time.sleep(self.sleep_after_read_frame) #print 'Reading one frame took', time.time() - start_time @@ -146,12 +166,24 @@ def get_frame(self): is not valid. ''' with self.lock: - return (self.latest_frame_idx, self.latest_frame_data) + return (self.latest_frame_idx, self.latest_frame_data, self.latest_label, self.latest_static_filename) def increment_static_file_idx(self, amount = 1): with self.lock: self.static_file_idx_increment += amount + def next_image(self): + if self.static_file_mode: + self.increment_static_file_idx(1) + else: + self.static_file_mode = True + + def prev_image(self): + if self.static_file_mode: + self.increment_static_file_idx(-1) + else: + self.static_file_mode = True + def _increment_and_set_frame(self, frame, from_cam): assert frame is not None with self.lock: @@ -159,42 +191,6 @@ def _increment_and_set_frame(self, frame, from_cam): self.latest_frame_data = frame self.latest_frame_is_from_cam = from_cam - def get_files_from_directory(self): - # returns list of files in requested directory - - available_files = [] - match_flags = re.IGNORECASE if self.settings.static_files_ignore_case else 0 - for filename in os.listdir(self.settings.static_files_dir): - if re.match(self.settings.static_files_regexp, filename, match_flags): - available_files.append(filename) - - return available_files - - def get_files_from_image_list(self): - # returns list of files in requested image list file - - available_files = [] - - with open(self.settings.static_files_input_file, 'r') as image_list_file: - lines = image_list_file.readlines() - # take first token from each 
line - available_files = [tsplit(line, True,' ',',','\t')[0] for line in lines if line.strip() != ""] - - return available_files - - def get_files_from_siamese_image_list(self): - # returns list of pair files in requested siamese image list file - - available_files = [] - - with open(self.settings.static_files_input_file, 'r') as image_list_file: - lines = image_list_file.readlines() - # take first and second tokens from each line - available_files = [(tsplit(line, True, ' ', ',','\t')[0], tsplit(line, True, ' ', ',','\t')[1]) - for line in lines if line.strip() != ""] - - return available_files - def check_increment_and_load_image(self): with self.lock: if (self.static_file_idx_increment == 0 and @@ -204,43 +200,40 @@ def check_increment_and_load_image(self): # Skip if a static frame is already loaded and there is no increment return - # available_files - local list of files - if self.settings.static_files_input_mode == "directory": - available_files = self.get_files_from_directory() - elif self.settings.static_files_input_mode == "image_list": - available_files = self.get_files_from_image_list() - elif self.settings.static_files_input_mode == "siamese_image_list": - available_files = self.get_files_from_siamese_image_list() - else: - raise Exception(('Error: setting static_files_input_mode has invalid option (%s)' % - (self.settings.static_files_input_mode) )) - - #print 'Found files:' - #for filename in available_files: - # print ' %s' % filename - assert len(available_files) != 0, ('Error: No files found in %s matching %s (current working directory is %s)' % + assert len(self.available_files) != 0, ('Error: No files found in %s matching %s (current working directory is %s)' % (self.settings.static_files_dir, self.settings.static_files_regexp, os.getcwd())) if self.static_file_idx is None: self.static_file_idx = 0 - self.static_file_idx = (self.static_file_idx + self.static_file_idx_increment) % len(available_files) + self.static_file_idx = (self.static_file_idx 
+ self.static_file_idx_increment) % len(self.available_files) self.static_file_idx_increment = 0 - if self.latest_static_filename != available_files[self.static_file_idx] or self.latest_static_frame is None: - self.latest_static_filename = available_files[self.static_file_idx] - - if self.settings.static_files_input_mode == "siamese_image_list": - # loading two images for siamese network - im1 = cv2_read_file_rgb(os.path.join(self.settings.static_files_dir, self.latest_static_filename[0])) - im2 = cv2_read_file_rgb(os.path.join(self.settings.static_files_dir, self.latest_static_filename[1])) - if not self.static_file_stretch_mode: - im1 = crop_to_square(im1) - im2 = crop_to_square(im2) - - im = (im1,im2) - - else: - im = cv2_read_file_rgb(os.path.join(self.settings.static_files_dir, self.latest_static_filename)) - if not self.static_file_stretch_mode: - im = crop_to_square(im) + if self.latest_static_filename != self.available_files[self.static_file_idx] or self.latest_static_frame is None: + self.latest_static_filename = self.available_files[self.static_file_idx] + + failed = False + try: + if self.settings.is_siamese: + # loading two images for siamese network + im1 = caffe.io.load_image(os.path.join(self.settings.static_files_dir, self.latest_static_filename[0]), color=not self.settings._calculated_is_gray_model) + im2 = caffe.io.load_image(os.path.join(self.settings.static_files_dir, self.latest_static_filename[1]), color=not self.settings._calculated_is_gray_model) + if not self.static_file_stretch_mode: + im1 = crop_to_square(im1) + im2 = crop_to_square(im2) + + im = (im1,im2) + + else: + im = caffe.io.load_image(os.path.join(self.settings.static_files_dir, self.latest_static_filename), color=not self.settings._calculated_is_gray_model) + if not self.static_file_stretch_mode: + im = crop_to_square(im) + except Exception as e: + failed = True + print 'Failed loading data: %s' % e + + if not failed: + self.latest_static_frame = im + + # if we have labels, keep the label for the current image + if
self.labels: + self.latest_label = self.labels[self.static_file_idx] - self.latest_static_frame = im self._increment_and_set_frame(self.latest_static_frame, False) diff --git a/live_vis.py b/live_vis.py index 9dde3be99..03eec9211 100644 --- a/live_vis.py +++ b/live_vis.py @@ -15,9 +15,8 @@ raise from misc import WithTimer -from image_misc import cv2_imshow_rgb, FormattedString, cv2_typeset_text, to_255 +from image_misc import cv2_imshow_rgb, FormattedString, cv2_typeset_text, to_255, gray_to_color, ensure_uint255_and_resize_without_fit from bindings import bindings -from input_fetcher import InputImageFetcher pane_debug_clr = (255, 64, 64) @@ -25,20 +24,20 @@ class ImproperlyConfigured(Exception): pass - - class Pane(object): '''Hold info about one window pane (rectangular region within the main window)''' def __init__(self, i_begin, j_begin, i_size, j_size): + self.reset(i_begin, j_begin, i_size, j_size) + + def reset(self, i_begin, j_begin, i_size, j_size): self.i_begin = i_begin self.j_begin = j_begin self.i_size = i_size self.j_size = j_size self.i_end = i_begin + i_size self.j_end = j_begin + j_size - self.data = None # eventually contains a slice of the window buffer - + self.data = None # eventually contains a slice of the window buffer class LiveVis(object): @@ -62,7 +61,7 @@ def __init__(self, settings): app = app_class(settings, self.bindings) self.apps[app_name] = app self.help_mode = False - self.window_name = 'Deep Visualization Toolbox' + self.window_name = 'Deep Visualization Toolbox | Model: %s' % (settings.model_to_load) self.quit = False self.debug_level = 0 @@ -115,11 +114,53 @@ def init_window(self): self.help_buffer = self.window_buffer.copy() # For rendering help mode self.help_pane.data = self.help_buffer[self.help_pane.i_begin:self.help_pane.i_end, self.help_pane.j_begin:self.help_pane.j_end] + # add listener for mouse clicks + cv2.setMouseCallback(self.window_name, self.on_mouse_click) + + def on_mouse_click(self, event, x, y, flags, 
param): + ''' + Handle mouse click events and dispatch them to all apps. + ''' + + if event == cv2.EVENT_LBUTTONUP: + for app_name, app in self.apps.iteritems(): + with WithTimer('%s:on_mouse_click' % app_name, quiet=self.debug_level < 1): + key = app.handle_mouse_left_click(x, y, flags, param, self.panes) + + def check_for_control_height_update(self): + + if hasattr(self.settings, '_calculated_control_pane_height') and \ + self.settings._calculated_control_pane_height != self.panes['caffevis_control'].i_size: + + self.panes['caffevis_control'].reset( + self.settings.window_panes[4][1][0], + self.settings.window_panes[4][1][1], + self.settings._calculated_control_pane_height, + self.settings.window_panes[4][1][3]) + + self.panes['caffevis_layers'].reset( + self.settings._calculated_control_pane_height, + self.settings.window_panes[5][1][1], + self.settings.window_panes[5][1][2] + 3*20 - self.settings._calculated_control_pane_height, + self.settings.window_panes[5][1][3]) + + for _, pane in self.panes.iteritems(): + pane.data = self.window_buffer[pane.i_begin:pane.i_end, pane.j_begin:pane.j_end] + + return True + + else: + return False + def run_loop(self): self.quit = False # Setup self.init_window() #cap = cv2.VideoCapture(self.settings.capture_device) + from input_fetcher import InputImageFetcher + self.input_updater = InputImageFetcher(self.settings) self.input_updater.bind_camera() self.input_updater.start() @@ -127,7 +168,7 @@ heartbeat_functions = [self.input_updater.heartbeat] for app_name, app in self.apps.iteritems(): print 'Starting app:', app_name - app.start() + app.start(self) heartbeat_functions.extend(app.get_heartbeats()) ii = 0 @@ -187,12 +228,16 @@ for app_name, app in self.apps.iteritems(): redraw_needed |= app.redraw_needed() + redraw_needed |= self.check_for_control_height_update() + # Grab latest frame from input_updater thread - fr_idx,fr_data = self.input_updater.get_frame() + fr_idx,fr_data,fr_label,fr_filename =
self.input_updater.get_frame() is_new_frame = (fr_idx != latest_frame_idx and fr_data is not None) if is_new_frame: latest_frame_idx = fr_idx latest_frame_data = fr_data + latest_label = fr_label + latest_filename = fr_filename frame_for_apps = fr_data if is_new_frame: @@ -206,7 +251,7 @@ def run_loop(self): # Pass frame to apps for processing for app_name, app in self.apps.iteritems(): with WithTimer('%s:handle_input' % app_name, quiet = self.debug_level < 1): - app.handle_input(latest_frame_data, self.panes) + app.handle_input(latest_frame_data, latest_label, latest_filename, self.panes) frame_for_apps = None # Tell each app to draw @@ -276,17 +321,11 @@ def handle_key_pre_apps(self, key): elif tag == 'toggle_input_mode': self.input_updater.toggle_input_mode() elif tag == 'static_file_increment': - if self.input_updater.static_file_mode: - self.input_updater.increment_static_file_idx(1) - else: - self.input_updater.static_file_mode = True + self.input_updater.next_image() elif tag == 'static_file_decrement': - if self.input_updater.static_file_mode: - self.input_updater.increment_static_file_idx(-1) - else: - self.input_updater.static_file_mode = True + self.input_updater.prev_image() elif tag == 'help_mode': - self.help_mode = not self.help_mode + self.toggle_help_mode() elif tag == 'stretch_mode': self.input_updater.toggle_stretch_mode() print 'Stretch mode is now', self.input_updater.static_file_stretch_mode @@ -301,7 +340,7 @@ def handle_key_pre_apps(self, key): def handle_key_post_apps(self, key): tag = self.bindings.get_tag(key) if tag == 'quit': - self.quit = True + self.set_quit_flag() elif key == None: pass else: @@ -316,17 +355,21 @@ def handle_key_post_apps(self, key): key, hex(key), key_label, tag, tag) def display_frame(self, frame): - if self.settings.static_files_input_mode == "siamese_image_list": + + full_pane_shape = self.panes['input'].data.shape[:2][::-1] + if self.settings.is_siamese and ((type(frame),len(frame)) == (tuple,2)): frame1 = 
frame[0] frame2 = frame[1] - full_pane_shape = self.panes['input'].data.shape[:2][::-1] - half_pane_shape = (full_pane_shape[0] / 2, full_pane_shape[1]) - frame_disp1 = cv2.resize(frame1[:], half_pane_shape) - frame_disp2 = cv2.resize(frame2[:], half_pane_shape) - frame_disp = np.concatenate((frame_disp1, frame_disp2), axis=1) + half_pane_shape = (full_pane_shape[0], full_pane_shape[1]/2) + frame_disp1 = ensure_uint255_and_resize_without_fit(frame1[:], half_pane_shape) + frame_disp2 = ensure_uint255_and_resize_without_fit(frame2[:], half_pane_shape) + frame_disp = np.concatenate((frame_disp1, frame_disp2), axis=1) else: - frame_disp = cv2.resize(frame[:], self.panes['input'].data.shape[:2][::-1]) + frame_disp = ensure_uint255_and_resize_without_fit(frame[:], full_pane_shape) + + if self.settings._calculated_is_gray_model: + frame_disp = gray_to_color(frame_disp) self.panes['input'].data[:] = frame_disp @@ -338,22 +381,18 @@ def draw_help(self): defaults = self.help_pane_defaults lines = [] lines.append([FormattedString('~ ~ ~ Deep Visualization Toolbox ~ ~ ~', defaults, align='center', width=self.help_pane.j_size)]) - lines.append([FormattedString('', defaults)]) - lines.append([FormattedString('Base keys', defaults)]) - - for tag in ('help_mode', 'freeze_cam', 'toggle_input_mode', 'static_file_increment', 'static_file_decrement', 'stretch_mode', 'quit'): - key_strings, help_string = self.bindings.get_key_help(tag) - label = '%10s:' % (','.join(key_strings)) - lines.append([FormattedString(label, defaults, width=120, align='right'), - FormattedString(help_string, defaults)]) - locy = cv2_typeset_text(self.help_pane.data, lines, loc, - line_spacing = self.settings.help_line_spacing) + locy, boxes = cv2_typeset_text(self.help_pane.data, lines, loc, + line_spacing = self.settings.help_line_spacing) for app_name, app in self.apps.iteritems(): locy = app.draw_help(self.help_pane, locy) + def toggle_help_mode(self): + self.help_mode = not self.help_mode + def 
set_quit_flag(self): + self.quit = True if __name__ == '__main__': print 'You probably want to run ./run_toolbox.py instead.' diff --git a/misc.py b/misc.py index a9f43d5c6..eb60869f3 100644 --- a/misc.py +++ b/misc.py @@ -59,4 +59,68 @@ def tsplit(string, no_empty_strings, *delimiters): if no_empty_strings: strings = filter(None, strings) - return strings \ No newline at end of file + return strings + + +def get_files_from_directory(settings): + # returns list of files in requested directory + + available_files = [] + match_flags = re.IGNORECASE if settings.static_files_ignore_case else 0 + for filename in os.listdir(settings.static_files_dir): + if re.match(settings.static_files_regexp, filename, match_flags): + available_files.append(filename) + + return available_files + + +def get_files_from_image_list(settings): + # returns list of files in requested image list file + + available_files = [] + labels = [] + + with open(settings.static_files_input_file, 'r') as image_list_file: + lines = image_list_file.readlines() + # take first token from each line + available_files = [tsplit(line, True, ' ', ',', '\t')[0] for line in lines if line.strip() != ""] + labels = [tsplit(line, True, ' ', ',', '\t')[1].strip() for line in lines if line.strip() != ""] + + return available_files, labels + + +def get_files_from_siamese_image_list(settings): + # returns list of pair files in requested siamese image list file + + available_files = [] + labels = [] + + with open(settings.static_files_input_file, 'r') as image_list_file: + lines = image_list_file.readlines() + # take first and second tokens from each line + available_files = [(tsplit(line, True, ' ', ',', '\t')[0], tsplit(line, True, ' ', ',', '\t')[1]) + for line in lines if line.strip() != ""] + labels = [tsplit(line, True, ' ', ',', '\t')[2].strip() for line in lines if line.strip() != ""] + + return available_files, labels + + +def get_files_list(settings): + + print 'Getting image list...' 
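The `get_files_from_image_list` and `get_files_from_siamese_image_list` helpers above take the first token (or the first two, for siamese pairs) of each non-empty line as the filename(s), and the next token as the label, splitting on spaces, commas, or tabs. A standalone sketch of that parsing, using a simplified stand-in for this repository's `tsplit` helper (`split_tokens` and `parse_image_list` are illustrative names, not part of the codebase):

```python
import re

def split_tokens(line):
    # stand-in for misc.tsplit: split on space, comma, or tab and drop empty tokens
    return [tok for tok in re.split(r'[ ,\t]+', line.strip()) if tok]

def parse_image_list(lines, siamese=False):
    files, labels = [], []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines, as the original helpers do
        tokens = split_tokens(line)
        if siamese:
            files.append((tokens[0], tokens[1]))  # pair of image paths
            labels.append(tokens[2])              # third token is the label
        else:
            files.append(tokens[0])               # first token is the image path
            labels.append(tokens[1])              # second token is the label
    return files, labels

print(parse_image_list(["cat.jpg 0", "", "dog.jpg,1"]))
print(parse_image_list(["a.jpg b.jpg same", "c.jpg\td.jpg\tdiff"], siamese=True))
```

Unlike the real helpers, this sketch does no file I/O; it only illustrates the expected line formats for `image_list` and `siamese_image_list` input modes.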
+ + # available_files - local list of files + if settings.static_files_input_mode == "directory": + available_files = get_files_from_directory(settings) + labels = None + elif (settings.static_files_input_mode == "image_list") and (not settings.is_siamese): + available_files, labels = get_files_from_image_list(settings) + elif (settings.static_files_input_mode == "image_list") and (settings.is_siamese): + available_files, labels = get_files_from_siamese_image_list(settings) + else: + raise Exception(('Error: setting static_files_input_mode has invalid option (%s)' % + (settings.static_files_input_mode))) + + print 'Getting image list... Done.' + + return available_files, labels diff --git a/model_settings/__init__.py b/model_settings/__init__.py new file mode 100644 index 000000000..e69de29bb diff --git a/model_settings/settings_bvlc_googlenet.py b/model_settings/settings_bvlc_googlenet.py new file mode 100644 index 000000000..5000578c8 --- /dev/null +++ b/model_settings/settings_bvlc_googlenet.py @@ -0,0 +1,43 @@ + +# basic network configuration +base_folder = '%DVT_ROOT%/' +caffevis_deploy_prototxt = base_folder + './models/bvlc-googlenet/bvlc-googlenet-deploy.prototxt' +caffevis_network_weights = base_folder + './models/bvlc-googlenet/bvlc_googlenet.caffemodel' +caffevis_data_mean = (104, 117, 123) + +# input images +static_files_dir = base_folder + './input_images/' + +# UI customization +caffevis_labels = base_folder + './models/bvlc-googlenet/ilsvrc_2012_labels.txt' +caffevis_prob_layer = 'prob' +caffevis_label_layers = ['loss3/classifier', 'prob'] + +layers_list = [] +layers_list.append({'name/s': 'conv1/7x7_s2', 'format': 'normal'}) +layers_list.append({'name/s': 'conv2/3x3_reduce', 'format': 'normal'}) +layers_list.append({'name/s': 'conv2/3x3', 'format': 'normal'}) +layers_list.append({'name/s': 'inception_3a/output', 'format': 'normal'}) +layers_list.append({'name/s': 'inception_3b/output', 'format': 'normal'}) +layers_list.append({'name/s': 
'inception_4a/output', 'format': 'normal'}) +layers_list.append({'name/s': 'inception_4b/output', 'format': 'normal'}) +layers_list.append({'name/s': 'inception_4c/output', 'format': 'normal'}) +layers_list.append({'name/s': 'inception_4d/output', 'format': 'normal'}) +layers_list.append({'name/s': 'inception_4e/output', 'format': 'normal'}) +layers_list.append({'name/s': 'inception_5a/output', 'format': 'normal'}) +layers_list.append({'name/s': 'inception_5b/output', 'format': 'normal'}) +layers_list.append({'name/s': 'prob', 'format': 'normal'}) + +def caffevis_layer_pretty_name_fn(name): + # Shorten many layer names to fit in control pane (full layer name visible in status bar) + name = name.replace('conv','c').replace('pool','p').replace('norm','n') + name = name.replace('inception_','i').replace('output','o').replace('reduce','r').replace('split_','s') + name = name.replace('__','_').replace('__','_') + return name + +# offline scripts configuration +caffevis_outputs_dir = base_folder + './models/bvlc-googlenet/outputs' +layers_to_output_in_offline_scripts = ['conv1/7x7_s2', 'conv2/3x3_reduce', 'conv2/3x3', 'inception_3a/output', + 'inception_3b/output', 'inception_4a/output', 'inception_4b/output', + 'inception_4c/output', 'inception_4d/output', 'inception_4e/output', + 'inception_5a/output', 'inception_5b/output', 'prob'] diff --git a/model_settings/settings_caffenet_yos.py b/model_settings/settings_caffenet_yos.py new file mode 100644 index 000000000..ad113377d --- /dev/null +++ b/model_settings/settings_caffenet_yos.py @@ -0,0 +1,22 @@ + +# basic network configuration +base_folder = '%DVT_ROOT%/' +caffevis_deploy_prototxt = base_folder + './models/caffenet-yos/caffenet-yos-deploy.prototxt' +caffevis_network_weights = base_folder + './models/caffenet-yos/caffenet-yos-weights' +caffevis_data_mean = base_folder + './models/caffenet-yos/ilsvrc_2012_mean.npy' + +# input images +static_files_dir = base_folder + './input_images/' + +# UI customization 
+caffevis_label_layers = ['fc8', 'prob'] +caffevis_labels = base_folder + './models/caffenet-yos/ilsvrc_2012_labels.txt' +caffevis_prob_layer = 'prob' + +def caffevis_layer_pretty_name_fn(name): + return name.replace('pool','p').replace('norm','n') + +# offline scripts configuration +# caffevis_outputs_dir = base_folder + './models/caffenet-yos/unit_jpg_vis' +caffevis_outputs_dir = base_folder + './models/caffenet-yos/outputs' +layers_to_output_in_offline_scripts = ['conv1', 'conv2', 'conv3', 'conv4', 'conv5', 'fc6', 'fc7', 'fc8', 'prob'] diff --git a/model_settings/settings_squeezenet.py b/model_settings/settings_squeezenet.py new file mode 100644 index 000000000..6318f620d --- /dev/null +++ b/model_settings/settings_squeezenet.py @@ -0,0 +1,25 @@ + +# basic network configuration +base_folder = '%DVT_ROOT%/' +caffevis_deploy_prototxt = base_folder + './models/squeezenet/deploy.prototxt' +caffevis_network_weights = base_folder + './models/squeezenet/squeezenet_v1.0.caffemodel' +caffevis_data_mean = (104, 117, 123) + +# input images +static_files_dir = base_folder + './input_images/' + +# UI customization +caffevis_labels = base_folder + './models/squeezenet/ilsvrc_2012_labels.txt' +caffevis_prob_layer = 'prob' +caffevis_label_layers = ['conv10', 'relu_conv10', 'pool10', 'prob'] + +def caffevis_layer_pretty_name_fn(name): + name = name.replace('fire','f').replace('relu_expand','re').replace('expand','e').replace('concat','c').replace('squeeze','s') + name = name.replace('1x1_','').replace('1x1','') + return name + +# Don't display duplicate *_split_* layers +caffevis_filter_layers = lambda name: '_split_' in name + +# offline scripts configuration +caffevis_outputs_dir = base_folder + './models/squeezenet/outputs' diff --git a/models/.gitignore b/models/.gitignore index be7186603..10e685655 100644 --- a/models/.gitignore +++ b/models/.gitignore @@ -1 +1,4 @@ *.caffemodel +*.processed_by_deepvis +receptive_fields_cache.pickled + diff --git 
a/models/bvlc-googlenet/settings_local.template-bvlc-googlenet.py b/models/bvlc-googlenet/settings_local.template-bvlc-googlenet.py deleted file mode 100644 index 9df2a34f2..000000000 --- a/models/bvlc-googlenet/settings_local.template-bvlc-googlenet.py +++ /dev/null @@ -1,54 +0,0 @@ -# Define critical settings and/or override defaults specified in -# settings.py. Copy this file to settings_local.py in the same -# directory as settings.py and edit. Any settings defined here -# will override those defined in settings.py - - - -# Set this to point to your compiled checkout of caffe -caffevis_caffe_root = '/path/to/caffe' - -# Load model: bvlc-googlenet -# Path to caffe deploy prototxt file. Minibatch size should be 1. -caffevis_deploy_prototxt = '%DVT_ROOT%/models/bvlc-googlenet/bvlc-googlenet-deploy.prototxt' - -# Path to network weights to load. -caffevis_network_weights = '%DVT_ROOT%/models/bvlc-googlenet/bvlc-googlenet.caffemodel' - -# Other optional settings; see complete documentation for each in settings.py. -caffevis_data_mean = (104, 117, 123) # per-channel mean -caffevis_labels = '%DVT_ROOT%/models/bvlc-googlenet/ilsvrc_2012_labels.txt' -caffevis_jpgvis_layers = [] -caffevis_prob_layer = 'prob' -caffevis_label_layers = ('loss3/classifier', 'prob') -def caffevis_layer_pretty_name_fn(name): - # Shorten many layer names to fit in control pane (full layer name visible in status bar) - name = name.replace('conv','c').replace('pool','p').replace('norm','n') - name = name.replace('inception_','i').replace('output','o').replace('reduce','r').replace('split_','s') - name = name.replace('__','_').replace('__','_') - return name -# Don't display duplicate *_split_* layers -caffevis_filter_layers = lambda name: '_split_' in name - -# Window panes for bvlc-googlenet (no caffevis_jpgvis pane, larger control pane to fit many layer names). 
-_control_height = 125 -window_panes = ( - # (i, j, i_size, j_size) - ('input', ( 0, 0, 300, 300)), - ('caffevis_aux', (300, 0, 300, 300)), - ('caffevis_back', (600, 0, 300, 300)), - ('caffevis_status', (900, 0, 30, 1500)), - ('caffevis_control', ( 0, 300, _control_height, 1200)), - ('caffevis_layers', ( _control_height, 300, 900 - _control_height, 1200)), -) -caffevis_layers_aspect_ratio = float(window_panes[-1][1][3])/window_panes[-1][1][2] # Actual ratio from caffevis_layers -caffevis_control_fsize = .85 - -# Use GPU? Default is True. -#caffevis_mode_gpu = True -# Display tweaks. -# Scale all window panes in UI by this factor -#global_scale = 1.0 -# Scale all fonts by this factor -#global_font_size = 1.0 - diff --git a/models/caffenet-yos/settings_local.template-caffenet-yos.py b/models/caffenet-yos/settings_local.template-caffenet-yos.py deleted file mode 100644 index 9a240bf5d..000000000 --- a/models/caffenet-yos/settings_local.template-caffenet-yos.py +++ /dev/null @@ -1,35 +0,0 @@ -# Define critical settings and/or override defaults specified in -# settings.py. Copy this file to settings_local.py in the same -# directory as settings.py and edit. Any settings defined here -# will override those defined in settings.py - - - -# Set this to point to your compiled checkout of caffe -caffevis_caffe_root = '/path/to/caffe' - -# Load model: caffenet-yos -# Path to caffe deploy prototxt file. Minibatch size should be 1. -caffevis_deploy_prototxt = '%DVT_ROOT%/models/caffenet-yos/caffenet-yos-deploy.prototxt' - -# Path to network weights to load. -caffevis_network_weights = '%DVT_ROOT%/models/caffenet-yos/caffenet-yos-weights' - -# Other optional settings; see complete documentation for each in settings.py. 
-caffevis_data_mean = '%DVT_ROOT%/models/caffenet-yos/ilsvrc_2012_mean.npy' -caffevis_labels = '%DVT_ROOT%/models/caffenet-yos/ilsvrc_2012_labels.txt' -caffevis_label_layers = ('fc8', 'prob') -caffevis_prob_layer = 'prob' -caffevis_unit_jpg_dir = '%DVT_ROOT%/models/caffenet-yos/unit_jpg_vis' -caffevis_jpgvis_layers = ['conv1', 'conv2', 'conv3', 'conv4', 'conv5', 'fc6', 'fc7', 'fc8', 'prob'] -caffevis_jpgvis_remap = {'pool1': 'conv1', 'pool2': 'conv2', 'pool5': 'conv5'} -def caffevis_layer_pretty_name_fn(name): - return name.replace('pool','p').replace('norm','n') - -# Use GPU? Default is True. -#caffevis_mode_gpu = True -# Display tweaks. -# Scale all window panes in UI by this factor -#global_scale = 1.0 -# Scale all fonts by this factor -#global_font_size = 1.0 diff --git a/models/squeezenet/settings_local.template-squeezenet.py b/models/squeezenet/settings_local.template-squeezenet.py deleted file mode 100644 index 1d320c15f..000000000 --- a/models/squeezenet/settings_local.template-squeezenet.py +++ /dev/null @@ -1,56 +0,0 @@ -# Define critical settings and/or override defaults specified in -# settings.py. Copy this file to settings_local.py in the same -# directory as settings.py and edit. Any settings defined here -# will override those defined in settings.py - - - -# Set this to point to your compiled checkout of caffe -caffevis_caffe_root = '/path/to/caffe' - -# Load model: squeezenet -# Path to caffe deploy prototxt file. Minibatch size should be 1. -caffevis_deploy_prototxt = '%DVT_ROOT%/models/squeezenet/deploy.prototxt' - -# Path to network weights to load. -caffevis_network_weights = '%DVT_ROOT%/models/squeezenet/squeezenet_v1.0.caffemodel' - - - -# Other optional settings; see complete documentation for each in settings.py. 
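`caffevis_data_mean` may be either a path to a mean image (as in the caffenet-yos template) or a per-channel tuple like `(104, 117, 123)`. A sketch of how a per-channel mean is typically broadcast over a Caffe-layout `(C, H, W)` BGR image — the helper name is illustrative, not the toolbox's actual function:

```python
import numpy as np

def subtract_channel_mean(img_chw, mean=(104, 117, 123)):
    # Reshape (C,) -> (C, 1, 1) so the mean broadcasts over the
    # spatial dimensions and is subtracted from every pixel per channel.
    mean = np.asarray(mean, dtype=img_chw.dtype).reshape(-1, 1, 1)
    return img_chw - mean

img = np.full((3, 2, 2), 120.0)   # toy 2x2 BGR image, all pixels 120
out = subtract_channel_mean(img)
print(out[:, 0, 0])               # [ 16.   3.  -3.]
```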
-caffevis_data_mean = (104, 117, 123) # per-channel mean -caffevis_labels = '%DVT_ROOT%/models/squeezenet/ilsvrc_2012_labels.txt' -caffevis_jpgvis_layers = [] -caffevis_prob_layer = 'prob' -caffevis_label_layers = ('conv10', 'relu_conv10', 'pool10', 'prob') -def caffevis_layer_pretty_name_fn(name): - name = name.replace('fire','f').replace('relu_expand','re').replace('expand','e').replace('concat','c').replace('squeeze','s') - name = name.replace('1x1_','').replace('1x1','') - return name -# Don't display duplicate *_split_* layers -caffevis_filter_layers = lambda name: '_split_' in name - -# Window panes for squeezenet (no caffevis_jpgvis pane, larger control pane to fit many layer names). -_control_height = 45 -window_panes = ( - # (i, j, i_size, j_size) - ('input', ( 0, 0, 300, 300)), - ('caffevis_aux', (300, 0, 300, 300)), - ('caffevis_back', (600, 0, 300, 300)), - ('caffevis_status', (900, 0, 30, 1500)), - ('caffevis_control', ( 0, 300, _control_height, 1200)), - ('caffevis_layers', ( _control_height, 300, 900 - _control_height, 1200)), -) -caffevis_layers_aspect_ratio = float(window_panes[-1][1][3])/window_panes[-1][1][2] # Actual ratio from caffevis_layers - -# Use GPU? Default is True. -#caffevis_mode_gpu = True -# Display tweaks. 
-# Scale all window panes in UI by this factor -#global_scale = 1.0 -# Scale all fonts by this factor -#global_font_size = 1.0 - -# Wider spacing -caffevis_control_line_spacing = 10 - diff --git a/optimize/gradient_optimizer.py b/optimize/gradient_optimizer.py index e170a12cf..efe3247aa 100755 --- a/optimize/gradient_optimizer.py +++ b/optimize/gradient_optimizer.py @@ -3,6 +3,7 @@ import os import errno import pickle +import datetime import StringIO from pylab import * from scipy.ndimage.filters import gaussian_filter @@ -13,6 +14,10 @@ from misc import mkdir_p, combine_dicts from image_misc import saveimagesc, saveimagescc +from caffe_misc import RegionComputer, get_max_data_extent, compute_data_layer_focus_area, extract_patch_from_image, \ + layer_name_to_top_name + +from siamese_helper import SiameseHelper class FindParams(object): @@ -21,6 +26,7 @@ def __init__(self, **kwargs): # Starting rand_seed = 0, start_at = 'mean_plus_rand', + batch_size = 9, # Optimization push_layer = 'prob', @@ -34,6 +40,7 @@ def __init__(self, **kwargs): small_norm_percentile = None, px_benefit_percentile = None, px_abs_benefit_percentile = None, + is_spatial = False, lr_policy = 'constant', lr_params = {'lr': 10.0}, @@ -78,7 +85,7 @@ def __str__(self): class FindResults(object): - def __init__(self): + def __init__(self,batch_index): self.ii = [] self.obj = [] self.idxmax = [] @@ -94,6 +101,7 @@ def __init__(self): self.last_obj = None self.last_xx = None self.meta_result = None + self.batch_index = batch_index def update(self, params, ii, acts, idxmax, xx, x0): assert params.push_dir > 0, 'push_dir < 0 not yet supported' @@ -137,7 +145,7 @@ def trim_arrays(self): def __str__(self): ret = StringIO.StringIO() - print >>ret, 'FindResults:' + print >>ret, 'FindResults[%d]:' % self.batch_index for key in sorted(self.__dict__.keys()): val = self.__dict__[key] if isinstance(val, list) and len(val) > 4: @@ -154,64 +162,85 @@ def __str__(self): class GradientOptimizer(object): '''Finds 
images by gradient.''' - def __init__(self, net, data_mean, labels = None, label_layers = None, channel_swap_to_rgb = None): + def __init__(self, settings, net, batched_data_mean, labels = None, label_layers = [], channel_swap_to_rgb = None): + self.settings = settings self.net = net - self.data_mean = data_mean + self.batched_data_mean = batched_data_mean self.labels = labels if labels else ['labels not provided' for ii in range(1000)] - self.label_layers = label_layers if label_layers else tuple() + self.label_layers = label_layers if label_layers else list() + self.siamese_helper = SiameseHelper(self.settings.layers_list) if channel_swap_to_rgb: self.channel_swap_to_rgb = array(channel_swap_to_rgb) else: - data_n_channels = self.data_mean.shape[0] - self.channel_swap_to_rgb = arange(data_n_channels) # Don't change order - self._data_mean_rgb_img = self.data_mean[self.channel_swap_to_rgb].transpose((1,2,0)) # Store as (227,227,3) in RGB order. + if settings._calculated_is_gray_model: + self.channel_swap_to_rgb = arange(1) + else: + self.channel_swap_to_rgb = arange(3) # Don't change order + + # since we have a batch of same data mean images, we can just take the first + if batched_data_mean is not None: + self._data_mean_rgb_img = self.batched_data_mean[0, self.channel_swap_to_rgb].transpose((1,2,0)) # Store as (227,227,3) in RGB order. + else: + self._data_mean_rgb_img = None - def run_optimize(self, params, prefix_template = None, brave = False, skipbig = False): + def run_optimize(self, params, prefix_template = None, brave = False, skipbig = False, skipsmall = False): '''All images are in Caffe format, e.g. 
shape (3, 227, 227) in BGR order.''' print '\n\nStarting optimization with the following parameters:' print params x0 = self._get_x0(params) - xx, results = self._optimize(params, x0) - self.save_results(params, results, prefix_template, brave = brave, skipbig = skipbig) - - print results.meta_result + xx, results, results_generated = self._optimize(params, x0, prefix_template) + if results_generated: + self.save_results(params, results, prefix_template, brave = brave, skipbig = skipbig, skipsmall = skipsmall) + print str([results[batch_index].meta_result for batch_index in range(params.batch_size)]) return xx def _get_x0(self, params): '''Chooses a starting location''' - + np.random.seed(params.rand_seed) + input_shape = self.net.blobs['data'].data.shape + if params.start_at == 'mean_plus_rand': - x0 = np.random.normal(0, 10, self.data_mean.shape) + x0 = np.random.normal(0, 10, input_shape) elif params.start_at == 'randu': - x0 = uniform(0, 255, self.data_mean.shape) - self.data_mean + if self.batched_data_mean is not None: + x0 = uniform(0, 255, input_shape) - self.batched_data_mean + else: + x0 = uniform(0, 255, input_shape) elif params.start_at == 'mean': - x0 = zeros(self.data_mean.shape) + x0 = zeros(input_shape) else: raise Exception('Unknown start conditions: %s' % params.start_at) return x0 - def _optimize(self, params, x0): + def _optimize(self, params, x0, prefix_template): xx = x0.copy() - xx = xx[newaxis,:] # Promote 3D -> 4D - results = FindResults() + results = [FindResults(batch_index) for batch_index in range(params.batch_size)] + + # check if all required outputs exist, in which case skip this optimization + all_outputs = [self.generate_output_names(batch_index, params, results, prefix_template, self.settings.caffevis_outputs_dir) for batch_index in range(params.batch_size)] + relevant_outputs = [best_X_name for [best_X_name, best_Xpm_name, majority_X_name, majority_Xpm_name, info_name, info_pkl_name, info_big_pkl_name] in all_outputs] + 
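The three start conditions handled by `_get_x0` above can be sketched independently of Caffe. This is a simplified stand-in: the shape and mean handling follow the code above, but the function signature is illustrative and omits the `FindParams` wrapper:

```python
import numpy as np

def get_x0(start_at, input_shape, batched_data_mean=None, rand_seed=0):
    """Choose a starting image batch, mirroring the three cases above."""
    np.random.seed(rand_seed)
    if start_at == 'mean_plus_rand':
        # Small Gaussian noise around the (implicit) mean image.
        return np.random.normal(0, 10, input_shape)
    elif start_at == 'randu':
        x0 = np.random.uniform(0, 255, input_shape)
        # Work in mean-subtracted space when a mean is available.
        return x0 - batched_data_mean if batched_data_mean is not None else x0
    elif start_at == 'mean':
        # All zeros == exactly the mean image in mean-subtracted space.
        return np.zeros(input_shape)
    raise Exception('Unknown start conditions: %s' % start_at)

x0 = get_x0('mean', (9, 3, 227, 227))
print(x0.shape, x0.max())   # (9, 3, 227, 227) 0.0
```

Seeding before drawing `x0` is what makes each `--rand-seed` (or, with batching, each batch index used as a seed) reproduce the same starting image.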
relevant_outputs_exist = [os.path.exists(best_X_name) for best_X_name in relevant_outputs] + if all(relevant_outputs_exist): + return xx, results, False # Whether or not the unit being optimized corresponds to a label (e.g. one of the 1000 imagenet classes) is_labeled_unit = params.push_layer in self.label_layers # Sanity checks for conv vs FC layers - data_shape = self.net.blobs[params.push_layer].data.shape + top_name = layer_name_to_top_name(self.net, params.push_layer) + data_shape = self.net.blobs[top_name].data.shape assert len(data_shape) in (2,4), 'Expected shape of length 2 (for FC) or 4 (for conv) layers but shape is %s' % repr(data_shape) - is_conv = (len(data_shape) == 4) + is_spatial = (len(data_shape) == 4) - if is_conv: + if is_spatial: if params.push_spatial == (0,0): recommended_spatial = (data_shape[2]/2, data_shape[3]/2) print ('WARNING: A unit on a conv layer (%s) is being optimized, but push_spatial\n' @@ -227,171 +256,312 @@ def _optimize(self, params, x0): push_label = self.labels[params.push_unit[0]] else: push_label = None - + + old_obj = np.zeros(params.batch_size) + obj = np.zeros(params.batch_size) for ii in range(params.max_iter): # 0. Crop data - xx = minimum(255.0, maximum(0.0, xx + self.data_mean)) - self.data_mean # Crop all values to [0,255] - - + if self.batched_data_mean is not None: + xx = minimum(255.0, maximum(0.0, xx + self.batched_data_mean)) - self.batched_data_mean # Crop all values to [0,255] + else: + xx = minimum(255.0, maximum(0.0, xx)) # Crop all values to [0,255] # 1. 
Push data through net out = self.net.forward_all(data = xx) #shownet(net) - acts = self.net.blobs[params.push_layer].data[0] # chop off batch dimension + top_name = layer_name_to_top_name(self.net, params.push_layer) + acts = self.net.blobs[top_name].data - if not is_conv: - # promote to 3D - acts = acts[:,np.newaxis,np.newaxis] - idxmax = unravel_index(acts.argmax(), acts.shape) - valmax = acts.max() - # idxmax for fc or prob layer will be like: (278, 0, 0) - # idxmax for conv layer will be like: (37, 4, 37) - obj = acts[params.push_unit] + layer_format = self.siamese_helper.get_layer_format_by_layer_name(params.push_layer) - - # 2. Update results - results.update(params, ii, acts, idxmax, xx[0], x0) + # note: no batch support in 'siamese_batch_pair' + if self.settings.is_siamese and layer_format == 'siamese_batch_pair' and acts.shape[0] == 2: + + if not is_spatial: + # promote to 4D + acts = np.reshape(acts, (2, -1, 1, 1)) + reshaped_acts = np.reshape(acts, (2, -1)) + idxmax = unravel_index(reshaped_acts.argmax(axis=1), acts.shape[1:]) + valmax = reshaped_acts.max(axis=1) + + # idxmax for fc or prob layer will be like: (batch,278, 0, 0) + # idxmax for conv layer will be like: (batch,37, 4, 37) + obj[0] = acts[0, params.push_unit[0], params.push_unit[1], params.push_unit[2]] + + elif self.settings.is_siamese and layer_format == 'siamese_batch_pair' and acts.shape[0] == 1: + + if not is_spatial: + # promote to 4D + acts = np.reshape(acts, (1, -1, 1, 1)) + reshaped_acts = np.reshape(acts, (1, -1)) + idxmax = unravel_index(reshaped_acts.argmax(axis=1), acts.shape[1:]) + valmax = reshaped_acts.max(axis=1) + + # idxmax for fc or prob layer will be like: (batch,278, 0, 0) + # idxmax for conv layer will be like: (batch,37, 4, 37) + obj[0] = acts[0, params.push_unit[0], params.push_unit[1], params.push_unit[2]] - - # 3. 
Print progress - if ii > 0: - if params.lr_policy == 'progress': - print '%-4d progress predicted: %g, actual: %g' % (ii, pred_prog, obj - old_obj) - else: - print '%-4d progress: %g' % (ii, obj - old_obj) else: - print '%d' % ii - old_obj = obj + if not is_spatial: + # promote to 4D + acts = np.reshape(acts, (params.batch_size, -1, 1, 1)) + reshaped_acts = np.reshape(acts, (params.batch_size, -1)) + idxmax = unravel_index(reshaped_acts.argmax(axis=1), acts.shape[1:]) + valmax = reshaped_acts.max(axis=1) - push_label_str = ('(%s)' % push_label) if is_labeled_unit else '' - max_label_str = ('(%s)' % self.labels[idxmax[0]]) if is_labeled_unit else '' - print ' push unit: %16s with value %g %s' % (params.push_unit, acts[params.push_unit], push_label_str) - print ' Max idx: %16s with value %g %s' % (idxmax, valmax, max_label_str) - print ' X:', xx.min(), xx.max(), norm(xx) + # idxmax for fc or prob layer will be like: (batch,278, 0, 0) + # idxmax for conv layer will be like: (batch,37, 4, 37) + obj = acts[np.arange(params.batch_size), params.push_unit[0], params.push_unit[1], params.push_unit[2]] + + # 2. Update results + for batch_index in range(params.batch_size): + results[batch_index].update(params, ii, acts[batch_index], \ + (idxmax[0][batch_index],idxmax[1][batch_index],idxmax[2][batch_index]), \ + xx[batch_index], x0[batch_index]) + + # 3. 
Print progress + if ii > 0: + if params.lr_policy == 'progress': + print 'iter %-4d batch_index %d progress predicted: %g, actual: %g' % (ii, batch_index, pred_prog[batch_index], obj[batch_index] - old_obj[batch_index]) + else: + print 'iter %-4d batch_index %d progress: %g' % (ii, batch_index, obj[batch_index] - old_obj[batch_index]) + else: + print 'iter %d batch_index %d' % (ii, batch_index) + old_obj[batch_index] = obj[batch_index] + + push_label_str = ('(%s)' % push_label) if is_labeled_unit else '' + max_label_str = ('(%s)' % self.labels[idxmax[0][batch_index]]) if is_labeled_unit else '' + print ' push unit: %16s with value %g %s' % (params.push_unit, acts[batch_index][params.push_unit], push_label_str) + print ' Max idx: %16s with value %g %s' % ((idxmax[0][batch_index],idxmax[1][batch_index],idxmax[2][batch_index]), valmax[batch_index], max_label_str) + print ' X:', xx[batch_index].min(), xx[batch_index].max(), norm(xx[batch_index]) # 4. Do backward pass to get gradient - diffs = self.net.blobs[params.push_layer].diff * 0 - if not is_conv: + top_name = layer_name_to_top_name(self.net, params.push_layer) + diffs = self.net.blobs[top_name].diff * 0 + if not is_spatial: # Promote bc -> bc01 diffs = diffs[:,:,np.newaxis,np.newaxis] - diffs[0][params.push_unit] = params.push_dir - backout = self.net.backward_from_layer(params.push_layer, diffs if is_conv else diffs[:,:,0,0]) - grad = backout['data'].copy() - print ' grad:', grad.min(), grad.max(), norm(grad) - if norm(grad) == 0: - print 'Grad exactly 0, failed' - results.meta_result = 'Metaresult: grad 0 failure' - break + if self.settings.is_siamese and layer_format == 'siamese_batch_pair' and acts.shape[0] == 2: + diffs[0, params.push_unit[0], params.push_unit[1], params.push_unit[2]] = params.push_dir + elif self.settings.is_siamese and layer_format == 'siamese_batch_pair' and acts.shape[0] == 1: + diffs[0, params.push_unit[0], params.push_unit[1], params.push_unit[2]] = params.push_dir + else: + 
diffs[np.arange(params.batch_size), params.push_unit[0], params.push_unit[1], params.push_unit[2]] = params.push_dir + backout = self.net.backward_from_layer(params.push_layer, diffs if is_spatial else diffs[:,:,0,0]) + grad = backout['data'].copy() + reshaped_grad = np.reshape(grad, (params.batch_size, -1)) + norm_grad = np.linalg.norm(reshaped_grad, axis=1) + min_grad = np.amin(reshaped_grad, axis=1) + max_grad = np.amax(reshaped_grad, axis=1) + + for batch_index in range(params.batch_size): + print ' layer: %s, channel: %d, batch_index: %d min grad: %f, max grad: %f, norm grad: %f' % (params.push_layer, params.push_unit[0], batch_index, min_grad[batch_index], max_grad[batch_index], norm_grad[batch_index]) + if norm_grad[batch_index] == 0: + print ' batch_index: %d, Grad exactly 0, failed' % batch_index + results[batch_index].meta_result = 'Metaresult: grad 0 failure' + break # 5. Pick gradient update per learning policy if params.lr_policy == 'progress01': # Useful for softmax layer optimization, taper off near 1 late_prog = params.lr_params['late_prog_mult'] * (1-obj) - desired_prog = min(params.lr_params['early_prog'], late_prog) - prog_lr = desired_prog / norm(grad)**2 - lr = min(params.lr_params['max_lr'], prog_lr) - print ' desired progress:', desired_prog, 'prog_lr:', prog_lr, 'lr:', lr - pred_prog = lr * dot(grad.flatten(), grad.flatten()) + desired_prog = np.amin(np.stack((np.repeat(params.lr_params['early_prog'], params.batch_size), late_prog), axis=1), axis=1) + prog_lr = desired_prog / np.square(norm_grad) + lr = np.amin(np.stack((np.repeat(params.lr_params['max_lr'], params.batch_size), prog_lr), axis=1), axis=1) + print ' entire batch, desired progress:', desired_prog, 'prog_lr:', prog_lr, 'lr:', lr + pred_prog = lr * np.sum(np.abs(reshaped_grad) ** 2, axis=-1) elif params.lr_policy == 'progress': # straight progress-based lr - prog_lr = params.lr_params['desired_prog'] / norm(grad)**2 - lr = min(params.lr_params['max_lr'], prog_lr) - print ' 
desired progress:', params.lr_params['desired_prog'], 'prog_lr:', prog_lr, 'lr:', lr - pred_prog = lr * dot(grad.flatten(), grad.flatten()) + prog_lr = params.lr_params['desired_prog'] / (norm_grad**2) + lr = np.amin(np.stack((np.repeat(params.lr_params['max_lr'], params.batch_size), prog_lr), axis=1), axis=1) + print ' entire batch, desired progress:', params.lr_params['desired_prog'], 'prog_lr:', prog_lr, 'lr:', lr + pred_prog = lr * np.sum(np.abs(reshaped_grad) ** 2, axis=-1) elif params.lr_policy == 'constant': # constant fixed learning rate - lr = params.lr_params['lr'] + lr = np.repeat(params.lr_params['lr'], params.batch_size) else: - raise Exception('Unimlemented lr_policy') + raise Exception('Unimplemented lr_policy') + + for batch_index in range(params.batch_size): + + # 6. Apply gradient update and regularizations + if ii < params.max_iter-1: + # Skip gradient and regularizations on the very last step (so the above printed info is valid for the last step) + xx[batch_index] += lr[batch_index] * grad[batch_index] + xx[batch_index] *= (1 - params.decay) + + channels = xx.shape[1] + + if params.blur_every is not 0 and params.blur_radius > 0: + if params.blur_radius < .3: + print 'Warning: blur-radius of .3 or less works very poorly' + #raise Exception('blur-radius of .3 or less works very poorly') + if ii % params.blur_every == 0: + for channel in range(channels): + cimg = gaussian_filter(xx[batch_index,channel], params.blur_radius) + xx[batch_index,channel] = cimg + if params.small_val_percentile > 0: + small_entries = (abs(xx[batch_index]) < percentile(abs(xx[batch_index]), params.small_val_percentile)) + xx[batch_index] = xx[batch_index] - xx[batch_index]*small_entries # e.g. 
set smallest 50% of xx to zero + + if params.small_norm_percentile > 0: + pxnorms = norm(xx[batch_index,np.newaxis,:,:,:], axis=1) + smallpx = pxnorms < percentile(pxnorms, params.small_norm_percentile) + smallpx3 = tile(smallpx[:,newaxis,:,:], (1,channels,1,1)) + xx[batch_index,:,:,:] = xx[batch_index,np.newaxis,:,:,:] - xx[batch_index,np.newaxis,:,:,:]*smallpx3 + + if params.px_benefit_percentile > 0: + pred_0_benefit = grad[batch_index,np.newaxis,:,:,:] * -xx[batch_index,np.newaxis,:,:,:] + px_benefit = pred_0_benefit.sum(1) # sum over color channels + smallben = px_benefit < percentile(px_benefit, params.px_benefit_percentile) + smallben3 = tile(smallben[:,newaxis,:,:], (1,channels,1,1)) + xx[batch_index,:,:,:] = xx[batch_index,np.newaxis,:,:,:] - xx[batch_index,np.newaxis,:,:,:]*smallben3 + + if params.px_abs_benefit_percentile > 0: + pred_0_benefit = grad[batch_index,np.newaxis,:,:,:] * -xx[batch_index,np.newaxis,:,:,:] + px_benefit = pred_0_benefit.sum(1) # sum over color channels + smallaben = abs(px_benefit) < percentile(abs(px_benefit), params.px_abs_benefit_percentile) + smallaben3 = tile(smallaben[:,newaxis,:,:], (1,channels,1,1)) + xx[batch_index,:,:,:] = xx[batch_index,np.newaxis,:,:,:] - xx[batch_index,np.newaxis,:,:,:]*smallaben3 + + print ' timestamp:', datetime.datetime.now() + + for batch_index in range(params.batch_size): + if results[batch_index].meta_result is None: + if results[batch_index].majority_obj is not None: + results[batch_index].meta_result = 'batch_index: %d, Metaresult: majority success' % batch_index + else: + results[batch_index].meta_result = 'batch_index: %d, Metaresult: majority failure' % batch_index - - # 6. 
Apply gradient update and regularizations - if ii < params.max_iter-1: - # Skip gradient and regularizations on the very last step (so the above printed info is valid for the last step) - xx += lr * grad - xx *= (1 - params.decay) - - if params.blur_every is not 0 and params.blur_radius > 0: - if params.blur_radius < .3: - print 'Warning: blur-radius of .3 or less works very poorly' - #raise Exception('blur-radius of .3 or less works very poorly') - if ii % params.blur_every == 0: - for channel in range(3): - cimg = gaussian_filter(xx[0,channel], params.blur_radius) - xx[0,channel] = cimg - if params.small_val_percentile > 0: - small_entries = (abs(xx) < percentile(abs(xx), params.small_val_percentile)) - xx = xx - xx*small_entries # e.g. set smallest 50% of xx to zero - - if params.small_norm_percentile > 0: - pxnorms = norm(xx, axis=1) - smallpx = pxnorms < percentile(pxnorms, params.small_norm_percentile) - smallpx3 = tile(smallpx[:,newaxis,:,:], (1,3,1,1)) - xx = xx - xx*smallpx3 - - if params.px_benefit_percentile > 0: - pred_0_benefit = grad * -xx - px_benefit = pred_0_benefit.sum(1) # sum over color channels - smallben = px_benefit < percentile(px_benefit, params.px_benefit_percentile) - smallben3 = tile(smallben[:,newaxis,:,:], (1,3,1,1)) - xx = xx - xx*smallben3 - - if params.px_abs_benefit_percentile > 0: - pred_0_benefit = grad * -xx - px_benefit = pred_0_benefit.sum(1) # sum over color channels - smallaben = abs(px_benefit) < percentile(abs(px_benefit), params.px_abs_benefit_percentile) - smallaben3 = tile(smallaben[:,newaxis,:,:], (1,3,1,1)) - xx = xx - xx*smallaben3 - - if results.meta_result is None: - if results.majority_obj is not None: - results.meta_result = 'Metaresult: majority success' - else: - results.meta_result = 'Metaresult: majority failure' + return xx, results, True - return xx, results + def find_selected_input_index(self, layer_name): - def save_results(self, params, results, prefix_template, brave = False, skipbig = False): - if 
prefix_template is None: - return + for item in self.settings.layers_list: + + # if we have only a single layer, the header is the layer name + if item['format'] == 'normal' and item['name/s'] == layer_name: + return -1 + + # if we got a pair of layers + elif item['format'] == 'siamese_layer_pair': + + if item['name/s'][0] == layer_name: + return 0 + + if item['name/s'][1] == layer_name: + return 1 + + elif item['format'] == 'siamese_batch_pair' and item['name/s'] == layer_name: + return 0 + + return -1 + + def generate_output_names(self, batch_index, params, results, prefix_template, output_dir): results_and_params = combine_dicts((('p.', params.__dict__), - ('r.', results.__dict__))) + ('r.', results[batch_index].__dict__))) prefix = prefix_template % results_and_params - + + prefix = os.path.join(output_dir, prefix) + if os.path.isdir(prefix): if prefix[-1] != '/': - prefix += '/' # append slash for dir-only template + prefix += '/' # append slash for dir-only template else: dirname = os.path.dirname(prefix) if dirname: mkdir_p(dirname) - # Don't overwrite previous results - if os.path.exists('%sinfo.txt' % prefix) and not brave: - raise Exception('Cowardly refusing to overwrite ' + '%sinfo.txt' % prefix) - - output_majority = False - if output_majority: - if results.majority_xx is not None: - asimg = results.majority_xx[self.channel_swap_to_rgb].transpose((1,2,0)) - saveimagescc('%smajority_X.jpg' % prefix, asimg, 0) - saveimagesc('%smajority_Xpm.jpg' % prefix, asimg + self._data_mean_rgb_img) # PlusMean - - if results.best_xx is not None: - asimg = results.best_xx[self.channel_swap_to_rgb].transpose((1,2,0)) - saveimagescc('%sbest_X.jpg' % prefix, asimg, 0) - saveimagesc('%sbest_Xpm.jpg' % prefix, asimg + self._data_mean_rgb_img) # PlusMean - - with open('%sinfo.txt' % prefix, 'w') as ff: - print >>ff, params - print >>ff - print >>ff, results - if not skipbig: - with open('%sinfo_big.pkl' % prefix, 'w') as ff: - pickle.dump((params, results), ff, protocol=-1) 
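The `prefix_template % results_and_params` expansion above works because `combine_dicts` merges the params and results dicts under `p.` and `r.` key prefixes, which `%(key)s`-style formatting can then reference. A minimal sketch (this `combine_dicts` approximates the helper imported from `misc`; the values are illustrative):

```python
def combine_dicts(prefixed_dicts):
    """Merge several dicts, prefixing each key (approximates the misc helper)."""
    merged = {}
    for prefix, d in prefixed_dicts:
        for key, value in d.items():
            merged[prefix + key] = value
    return merged

params = {'push_layer': 'prob', 'push_channel': 278, 'rand_seed': 0}
results = {'batch_index': 3}
names = combine_dicts((('p.', params), ('r.', results)))

# The default output template from optimize_image.py:
template = '%(p.push_layer)s_%(p.push_channel)04d_%(p.rand_seed)d'
prefix = template % names
print(prefix)                      # prob_0278_0
print('%s_best_X.jpg' % prefix)    # prob_0278_0_best_X.jpg
```

Since the output filenames are a pure function of the parameters, `_optimize` can cheaply check whether all of them already exist on disk and skip the whole optimization if so.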
- results.trim_arrays() - with open('%sinfo.pkl' % prefix, 'w') as ff: - pickle.dump((params, results), ff, protocol=-1) + best_X_name = '%s_best_X.jpg' % prefix + best_Xpm_name = '%s_best_Xpm.jpg' % prefix + majority_X_name = '%s_majority_X.jpg' % prefix + majority_Xpm_name = '%s_majority_Xpm.jpg' % prefix + info_name = '%s_info.txt' % prefix + info_pkl_name = '%s_info.pkl' % prefix + info_big_pkl_name = '%s_info_big.pkl' % prefix + return [best_X_name, best_Xpm_name, majority_X_name, majority_Xpm_name, info_name, info_pkl_name, info_big_pkl_name] + + def save_results(self, params, results, prefix_template, brave = False, skipbig = False, skipsmall = False): + if prefix_template is None: + return + + for batch_index in range(params.batch_size): + + [best_X_name, best_Xpm_name, majority_X_name, majority_Xpm_name, info_name, info_pkl_name, info_big_pkl_name] = \ + self.generate_output_names(batch_index, params, results, prefix_template, self.settings.caffevis_outputs_dir) + + # Don't overwrite previous results + if os.path.exists(info_name) and not brave: + raise Exception('Cowardly refusing to overwrite ' + info_name) + + output_majority = False + if output_majority: + # NOTE: this section wasn't tested after changes to code, so some minor index tweaking are in order + if results[batch_index].majority_xx is not None: + asimg = results[batch_index].majority_xx[self.channel_swap_to_rgb].transpose((1,2,0)) + saveimagescc(majority_X_name, asimg, 0) + saveimagesc(majority_Xpm_name, asimg + self._data_mean_rgb_img) # PlusMean + + if results[batch_index].best_xx is not None: + # results[batch_index].best_xx.shape is (6,224,224) + + def save_output(data, channel_swap_to_rgb, best_X_image_name): + # , best_Xpm_image_name, data_mean_rgb_img): + asimg = data[channel_swap_to_rgb].transpose((1, 2, 0)) + saveimagescc(best_X_image_name, asimg, 0) + + # get center position, relative to layer, of best maximum + [temp_ii, temp_jj] = 
results[batch_index].idxmax[results[batch_index].best_ii][1:3] + + is_spatial = params.is_spatial + layer_name = params.push_layer + size_ii, size_jj = get_max_data_extent(self.net, self.settings, layer_name, is_spatial) + data_size_ii, data_size_jj = self.net.blobs['data'].data.shape[2:4] + + [out_ii_start, out_ii_end, out_jj_start, out_jj_end, data_ii_start, data_ii_end, data_jj_start, data_jj_end] = \ + compute_data_layer_focus_area(is_spatial, temp_ii, temp_jj, self.settings, layer_name, size_ii, size_jj, data_size_ii, data_size_jj) + + selected_input_index = self.find_selected_input_index(layer_name) + + out_arr = extract_patch_from_image(results[batch_index].best_xx, self.net, selected_input_index, self.settings, + data_ii_end, data_ii_start, data_jj_end, data_jj_start, + out_ii_end, out_ii_start, out_jj_end, out_jj_start, size_ii, size_jj) + + if self.settings.is_siamese: + save_output(out_arr, + channel_swap_to_rgb=self.channel_swap_to_rgb[[0, 1, 2]], + best_X_image_name=best_X_name) + else: + save_output(out_arr, + channel_swap_to_rgb=self.channel_swap_to_rgb, + best_X_image_name=best_X_name) + + if self.settings.optimize_image_generate_plus_mean: + out_arr_pm = extract_patch_from_image(results[batch_index].best_xx + self.batched_data_mean, self.net, selected_input_index, self.settings, + data_ii_end, data_ii_start, data_jj_end, data_jj_start, + out_ii_end, out_ii_start, out_jj_end, out_jj_start, size_ii, size_jj) + + if self.settings.is_siamese: + save_output(out_arr_pm, + channel_swap_to_rgb=self.channel_swap_to_rgb[[0, 1, 2]], + best_X_image_name=best_Xpm_name) + else: + save_output(out_arr_pm, + channel_swap_to_rgb=self.channel_swap_to_rgb, + best_X_image_name=best_Xpm_name) + + with open(info_name, 'w') as ff: + print >>ff, params + print >>ff + print >>ff, results[batch_index] + if not skipbig: + with open(info_big_pkl_name, 'w') as ff: + pickle.dump((params, results[batch_index]), ff, protocol=-1) + if not skipsmall: + 
results[batch_index].trim_arrays() + with open(info_pkl_name, 'w') as ff: + pickle.dump((params, results[batch_index]), ff, protocol=-1) + diff --git a/optimize_image.py b/optimize_image.py index a555d9e02..93b8fe6a9 100755 --- a/optimize_image.py +++ b/optimize_image.py @@ -1,5 +1,9 @@ #! /usr/bin/env python +# this import must come first to make sure we use the non-display backend +import matplotlib +matplotlib.use('Agg') + import os import sys import argparse @@ -7,14 +11,17 @@ import settings from optimize.gradient_optimizer import GradientOptimizer, FindParams -from caffevis.caffevis_helper import check_force_backward_true, read_label_file +from caffevis.caffevis_helper import read_label_file, set_mean +from settings_misc import load_network +from caffe_misc import layer_name_to_top_name + LR_POLICY_CHOICES = ('constant', 'progress', 'progress01') def get_parser(): - parser = argparse.ArgumentParser(description='Script to find, with or without regularization, images that cause high or low activations of specific neurons in a network via numerical optimization. Settings are read from settings.py, overridden in settings_local.py, and may be further overridden on the command line.', + parser = argparse.ArgumentParser(description='Script to find, with or without regularization, images that cause high or low activations of specific neurons in a network via numerical optimization. Settings are read from settings.py, overridden in settings_MODEL.py and settings_user.py, and may be further overridden on the command line.', formatter_class=lambda prog: argparse.ArgumentDefaultsHelpFormatter(prog, width=100) ) @@ -25,8 +32,6 @@ def get_parser(): help = 'Path to caffe network prototxt.') parser.add_argument('--net-weights', type = str, default = settings.caffevis_network_weights, help = 'Path to caffe network weights.') - parser.add_argument('--mean', type = str, default = repr(settings.caffevis_data_mean), - help = '''Mean.
The mean may be None, a tuple of one mean value per channel, or a string specifying the path to a mean image to load. Because of the multiple datatypes supported, this argument must be specified as a string that evaluates to a valid Python object. For example: "None", "(10,20,30)", and "'mean.npy'" are all valid values. Note that to specify a string path to a mean file, it must be passed with quotes, which usually entails passing it with double quotes in the shell! Alternately, just provide the mean in settings_local.py.''') parser.add_argument('--channel-swap-to-rgb', type = str, default = '(2,1,0)', help = 'Permutation to apply to channels to change to RGB space for plotting. Hint: (0,1,2) if your network is trained for RGB, (2,1,0) if it is trained for BGR.') parser.add_argument('--data-size', type = str, default = '(227,227)', @@ -37,12 +42,14 @@ def get_parser(): # Where to start parser.add_argument('--start-at', type = str, default = 'mean_plus_rand', choices = ('mean_plus_rand', 'randu', 'mean'), help = 'How to generate x0, the initial point used in optimization.') - parser.add_argument('--rand-seed', type = int, default = 0, + parser.add_argument('--rand-seed', type = int, default = settings.optimize_image_rand_seed, help = 'Random seed used for generating the start-at image (use different seeds to generate different images).') + parser.add_argument('--batch-size', type=int, default=settings.optimize_image_batch_size, + help = 'Batch size used for generating several images, each index will be used as random seed') # What to optimize - parser.add_argument('--push-layer', type = str, default = 'fc8', - help = 'Name of layer that contains the desired neuron whose value is optimized.') + parser.add_argument('--push-layers', type = list, default = settings.layers_to_output_in_offline_scripts, + help = 'Name of layers that contains the desired neuron whose value is optimized.') parser.add_argument('--push-channel', type = int, default = '130', help = 'Channel 
number for desired neuron whose value is optimized (channel for conv, neuron index for FC).') parser.add_argument('--push-spatial', type = str, default = 'None', @@ -51,11 +58,11 @@ def get_parser(): help = 'Which direction to push the activation of the selected neuron, that is, the value used to begin backprop. For example, use 1 to maximize the selected neuron activation and -1 to minimize it.') # Use regularization? - parser.add_argument('--decay', type = float, default = 0, + parser.add_argument('--decay', type = float, default = settings.optimize_image_decay, help = 'Amount of L2 decay to use.') - parser.add_argument('--blur-radius', type = float, default = 0, + parser.add_argument('--blur-radius', type = float, default = settings.optimize_image_blur_radius, help = 'Radius in pixels of blur to apply after each BLUR_EVERY steps. If 0, perform no blurring. Blur sizes between 0 and 0.3 work poorly.') - parser.add_argument('--blur-every', type = int, default = 0, + parser.add_argument('--blur-every', type = int, default = settings.optimize_image_blue_every, help = 'Blur every BLUR_EVERY steps. If 0, perform no blurring.') parser.add_argument('--small-val-percentile', type = float, default = 0, help = 'Induce sparsity by setting pixels with absolute value under SMALL_VAL_PERCENTILE percentile to 0. Not discussed in paper. 0 to disable.') @@ -67,21 +74,19 @@ def get_parser(): help = 'Induce sparsity by setting pixels with contribution under PX_BENEFIT_PERCENTILE percentile to 0. \\theta_{c_pct} from the paper. 0 to disable.') # How much to optimize - parser.add_argument('--lr-policy', type = str, default = 'constant', choices = LR_POLICY_CHOICES, + parser.add_argument('--lr-policy', type = str, default = settings.optimize_image_lr_policy, choices = LR_POLICY_CHOICES, help = 'Learning rate policy. 
See description in lr-params.') - parser.add_argument('--lr-params', type = str, default = '{"lr": 1}', + parser.add_argument('--lr-params', type = str, default = settings.optimize_image_lr_params, help = 'Learning rate params, specified as a string that evaluates to a Python dict. Params that must be provided depend on which lr-policy is selected. The "constant" policy requires the "lr" key and uses the given constant learning rate. The "progress" policy requires the "max_lr" and "desired_prog" keys and scales the learning rate such that the objective function will change by an amount equal to DESIRED_PROG under a linear objective assumption, except the LR is limited to MAX_LR. The "progress01" policy requires the "max_lr", "early_prog", and "late_prog_mult" keys and is tuned for optimizing neurons with outputs in the [0,1] range, e.g. neurons on a softmax layer. Under this policy optimization slows down as the output approaches 1 (see code for details).') - parser.add_argument('--max-iter', type = int, default = 500, - help = 'Number of iterations of the optimization loop.') + parser.add_argument('--max-iters', type = list, default = settings.optimize_image_max_iters, + help = 'List of iteration counts for the optimization loop, cycled over the push layers.') # Where to save results - parser.add_argument('--output-prefix', type = str, default = 'optimize_results/opt', - help = 'Output path and filename prefix (default: optimize_results/opt)') - parser.add_argument('--output-template', type = str, default = '%(p.push_layer)s_%(p.push_channel)04d_%(p.rand_seed)d', - help = 'Output filename template; see code for details (default: "%%(p.push_layer)s_%%(p.push_channel)04d_%%(p.rand_seed)d"). ' 'The default output-prefix and output-template produce filenames like "optimize_results/opt_prob_0278_0_best_X.jpg"') - parser.add_argument('--brave', action = 'store_true', help = 'Allow overwriting existing results files. Default: off, i.e.
cowardly refuse to overwrite existing files.') - parser.add_argument('--skipbig', action = 'store_true', help = 'Skip outputting large *info_big.pkl files (contains pickled version of x0, last x, best x, first x that attained max on the specified layer.') + parser.add_argument('--output-prefix', type = str, default = settings.optimize_image_output_prefix, + help = 'Output path and filename prefix (default: outputs/%(p.push_layer)s/unit_%(p.push_channel)04d/opt_%(r.batch_index)03d)') + parser.add_argument('--brave', action = 'store_true', default=True, help = 'Allow overwriting existing results files. Default: on.') + parser.add_argument('--skipbig', action = 'store_true', default=True, help = 'Skip outputting large *info_big.pkl files (contains pickled versions of x0, last x, best x, and the first x that attained the max on the specified layer).') + parser.add_argument('--skipsmall', action = 'store_true', default=True, help = 'Skip outputting small *info.pkl files (contains pickled version of..') return parser @@ -129,93 +134,98 @@ def parse_and_validate_push_spatial(parser, push_spatial): return push_spatial - def main(): parser = get_parser() args = parser.parse_args() # Finish parsing args - channel_swap_to_rgb = eval(args.channel_swap_to_rgb) - assert isinstance(channel_swap_to_rgb, tuple) and len(channel_swap_to_rgb) > 0, 'channel_swap_to_rgb should be a tuple' - data_size = eval(args.data_size) - assert isinstance(data_size, tuple) and len(data_size) == 2, 'data_size should be a length 2 tuple' - #channel_swap_inv = tuple([net_channel_swap.index(ii) for ii in range(len(net_channel_swap))]) lr_params = parse_and_validate_lr_params(parser, args.lr_policy, args.lr_params) push_spatial = parse_and_validate_push_spatial(parser, args.push_spatial) - - # Load mean - data_mean = eval(args.mean) - - if isinstance(data_mean, basestring): - # If the mean is given as a filename, load the file - try: - data_mean = np.load(data_mean) -
except IOError: - print '\n\nCound not load mean file:', data_mean - print 'To fetch a default model and mean file, use:\n' - print ' $ cd models/caffenet-yos/' - print ' $ cp ./fetch.sh\n\n' - print 'Or to use your own mean, change caffevis_data_mean in settings_local.py or override by running with `--mean MEAN_FILE` (see --help).\n' - raise - # Crop center region (e.g. 227x227) if mean is larger (e.g. 256x256) - excess_h = data_mean.shape[1] - data_size[0] - excess_w = data_mean.shape[2] - data_size[1] - assert excess_h >= 0 and excess_w >= 0, 'mean should be at least as large as %s' % repr(data_size) - data_mean = data_mean[:, (excess_h/2):(excess_h/2+data_size[0]), (excess_w/2):(excess_w/2+data_size[1])] - elif data_mean is None: - pass - else: - # The mean has been given as a value or a tuple of values - data_mean = np.array(data_mean) - # Promote to shape C,1,1 - while len(data_mean.shape) < 3: - data_mean = np.expand_dims(data_mean, -1) - - print 'Using mean:', repr(data_mean) - - # Load network - sys.path.insert(0, os.path.join(args.caffe_root, 'python')) - import caffe - net = caffe.Classifier( - args.deploy_proto, - args.net_weights, - mean = data_mean, - raw_scale = 1.0, - ) - check_force_backward_true(settings.caffevis_deploy_prototxt) + + settings.caffevis_deploy_prototxt = args.deploy_proto + settings.caffevis_network_weights = args.net_weights + + net, data_mean = load_network(settings) + + # validate batch size + if settings.is_siamese and settings.siamese_network_format == 'siamese_batch_pair': + # currently, no batch support for siamese_batch_pair networks + # it can be added by simply handling the batch indexes properly, but it should be thoroughly tested + assert (settings.max_tracker_batch_size == 1) + current_data_shape = net.blobs['data'].shape + net.blobs['data'].reshape(args.batch_size, current_data_shape[1], current_data_shape[2], current_data_shape[3]) + net.reshape() labels = None if settings.caffevis_labels: labels =
read_label_file(settings.caffevis_labels) - optimizer = GradientOptimizer(net, data_mean, labels = labels, - label_layers = settings.caffevis_label_layers, - channel_swap_to_rgb = channel_swap_to_rgb) - - params = FindParams( - start_at = args.start_at, - rand_seed = args.rand_seed, - push_layer = args.push_layer, - push_channel = args.push_channel, - push_spatial = push_spatial, - push_dir = args.push_dir, - decay = args.decay, - blur_radius = args.blur_radius, - blur_every = args.blur_every, - small_val_percentile = args.small_val_percentile, - small_norm_percentile = args.small_norm_percentile, - px_benefit_percentile = args.px_benefit_percentile, - px_abs_benefit_percentile = args.px_abs_benefit_percentile, - lr_policy = args.lr_policy, - lr_params = lr_params, - max_iter = args.max_iter, - ) - - prefix_template = '%s_%s_' % (args.output_prefix, args.output_template) - im = optimizer.run_optimize(params, prefix_template = prefix_template, - brave = args.brave, skipbig = args.skipbig) + if data_mean is not None: + if len(data_mean.shape) == 3: + batched_data_mean = np.repeat(data_mean[np.newaxis, :, :, :], args.batch_size, axis=0) + elif len(data_mean.shape) == 1: + data_mean = data_mean[np.newaxis,:,np.newaxis,np.newaxis] + batched_data_mean = np.tile(data_mean, (args.batch_size,1,current_data_shape[2],current_data_shape[3])) + else: + batched_data_mean = data_mean + optimizer = GradientOptimizer(settings, net, batched_data_mean, labels = labels, + label_layers = settings.caffevis_label_layers, + channel_swap_to_rgb = settings.caffe_net_channel_swap) + + if not args.push_layers: + print "ERROR: No layers to work on, please set layers_to_output_in_offline_scripts to a list of layers" + return + + # go over push layers + for count, push_layer in enumerate(args.push_layers): + + top_name = layer_name_to_top_name(net, push_layer) + blob = net.blobs[top_name].data + is_spatial = (len(blob.shape) == 4) + channels = blob.shape[1] + + # get layer definition + layer_def =
settings._layer_name_to_record[push_layer] + + if is_spatial: + push_spatial = (layer_def.filter[0] / 2, layer_def.filter[1] / 2) + else: + push_spatial = (0, 0) + + # if channels are defined in the settings file, use them + if settings.optimize_image_channels: + channels_list = settings.optimize_image_channels + else: + channels_list = range(channels) + + # go over channels + for current_channel in channels_list: + params = FindParams( + start_at = args.start_at, + rand_seed = args.rand_seed, + batch_size = args.batch_size, + push_layer = push_layer, + push_channel = current_channel, + push_spatial = push_spatial, + push_dir = args.push_dir, + decay = args.decay, + blur_radius = args.blur_radius, + blur_every = args.blur_every, + small_val_percentile = args.small_val_percentile, + small_norm_percentile = args.small_norm_percentile, + px_benefit_percentile = args.px_benefit_percentile, + px_abs_benefit_percentile = args.px_abs_benefit_percentile, + lr_policy = args.lr_policy, + lr_params = lr_params, + max_iter = args.max_iters[count % len(args.max_iters)], + is_spatial = is_spatial, + ) + + optimizer.run_optimize(params, prefix_template = args.output_prefix, + brave = args.brave, skipbig = args.skipbig, skipsmall = args.skipsmall) if __name__ == '__main__': diff --git a/run_toolbox.py b/run_toolbox.py index b342fd4ae..7731fcbdd 100755 --- a/run_toolbox.py +++ b/run_toolbox.py @@ -1,5 +1,9 @@ #! /usr/bin/env python +# this import must come first to make sure we use the non-display backend +import matplotlib +matplotlib.use('Agg') + import os from live_vis import LiveVis from bindings import bindings @@ -7,11 +11,8 @@ import settings except: print '\nError importing settings.py. Check the error message below for more information.' - print "If you haven't already, you'll want to copy one of the settings_local.template-*.py files" - print 'to settings_local.py and edit it to point to your caffe checkout. E.g.
via:' - print - print ' $ cp models/caffenet-yos/settings_local.template-caffenet-yos.py settings_local.py' - print ' $ < edit settings_local.py >\n' + print "If you haven't already, you'll want to open the settings_model_selector.py file" + print 'and edit it to point to your caffe checkout.\n' raise if not os.path.exists(settings.caffevis_caffe_root): diff --git a/run_webui.py b/run_webui.py new file mode 100644 index 000000000..8e41ba7aa --- /dev/null +++ b/run_webui.py @@ -0,0 +1,69 @@ +#! /usr/bin/env python + +import os +import thread +from live_vis import LiveVis +from bindings import bindings +try: + import settings +except: + print '\nError importing settings.py. Check the error message below for more information.' + print "If you haven't already, you'll want to open the settings_model_selector.py file" + print 'and edit it to point to your caffe checkout.\n' + raise + +if not os.path.exists(settings.caffevis_caffe_root): + raise Exception('ERROR: Set caffevis_caffe_root in settings.py first.') + +import cv2 +from flask import Flask, render_template, Response + +app = Flask(__name__) + +@app.route('/') +def index(): + return render_template('index.html') + + +def gen(): + while True: + frame = get_frame() + yield (b'--frame\r\n' + b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n\r\n') + + +def get_frame(): + # We are using Motion JPEG, but OpenCV defaults to capture raw images, + # so we must encode it into JPEG in order to correctly display the + # video stream. 
+ + global lv + + ret, jpeg = cv2.imencode('.jpg', lv.window_buffer[:,:,::-1]) + return jpeg.tobytes() + + +@app.route('/video_feed') +def video_feed(): + return Response(gen(), mimetype='multipart/x-mixed-replace; boundary=frame') + +if __name__ == '__main__': + + global lv + + def someFunc(): + print "someFunc was called" + lv.run_loop() + + + if os.environ.get("WERKZEUG_RUN_MAIN") == "true": + # The reloader has already run - do what you want to do here + + lv = LiveVis(settings) + help_keys, _ = bindings.get_key_help('help_mode') + quit_keys, _ = bindings.get_key_help('quit') + print '\n\nRunning toolbox. Push %s for help or %s to quit.\n\n' % (help_keys[0], quit_keys[0]) + + thread.start_new_thread(someFunc, ()) + + app.run(host='127.0.0.1', debug=True) diff --git a/settings.py b/settings.py index 69c8a51bb..b8f7db7ce 100644 --- a/settings.py +++ b/settings.py @@ -1,19 +1,17 @@ # Settings for Deep Visualization Toolbox # # Note: Probably don't change anything in this file. To override -# settings, define them in settings_local.py rather than changing them -# here. +# settings, define them in your network specific settings file or settings_user.py rather than changing them here. +# Import network settings. Turn off creation of X.pyc to avoid stale settings if X.py is removed. import os import sys - -# Import local / overridden settings. Turn off creation of settings_local.pyc to avoid stale settings if settings_local.py is removed. sys.dont_write_bytecode = True try: - from settings_local import * + from settings_model_selector import * except ImportError: - if not os.path.exists('settings_local.py'): - raise Exception('Could not import settings_local. Did you create it from the template? 
See README and start with:\n\n $ cp models/caffenet-yos/settings_local.template-caffenet-yos.py settings_local.py') + if not os.path.exists('settings_model_selector.py'): + raise Exception('Could not import settings_model_selector.py') else: raise # Resume usual pyc creation @@ -27,6 +25,9 @@ # #################################### +# base folder for paths defined in the settings +base_folder = locals().get('base_folder', '') + # Which device to use for webcam input. On Mac the default device, 0, # works for builtin camera or external USB webcam, if plugged in. If # you have multiple cameras, you might need to update this value. To @@ -37,7 +38,7 @@ input_updater_sleep_after_read_frame = locals().get('input_updater_sleep_after_read_frame', 1.0/20) # Input updater thread dies after this many seconds without a heartbeat. Useful during debugging to avoid other threads running after the main thread has crashed. -input_updater_heartbeat_required = locals().get('input_updater_heartbeat_required', 15.0) +input_updater_heartbeat_required = locals().get('input_updater_heartbeat_required', 150.0 if __debug__ else 15.0) # How long to sleep while waiting for key presses and redraws. Recommendation: 1 (min: 1) main_loop_sleep_ms = locals().get('main_loop_sleep_ms', 1) @@ -66,16 +67,21 @@ # height of the control panel (to accommodate varying length of layer # names), one can simply define control_pane_height.
If more if 'default_window_panes' in locals(): - raise Exception('Override window panes in settings_local.py by defining window_panes, not default_window_panes') + raise Exception('Override window panes in settings_MODEL.py by defining window_panes, not default_window_panes') + +# set the following member to make control pane height constant, instead of automatic +# control_pane_height = locals().get('control_pane_height', 3*20) + default_window_panes = ( # (i, j, i_size, j_size) - ('input', ( 0, 0, 300, 300)), # This pane is required to show the input picture - ('caffevis_aux', (300, 0, 300, 300)), - ('caffevis_back', (600, 0, 300, 300)), - ('caffevis_status', (900, 0, 30, 1500)), - ('caffevis_control', ( 0, 300, 30, 900)), - ('caffevis_layers', ( 30, 300, 870, 900)), - ('caffevis_jpgvis', ( 0, 1200, 900, 300)), + ('input', ( 0, 0, 300, 300)), # This pane is required to show the input picture + ('caffevis_aux', ( 300, 0, 300, 300)), + ('caffevis_back', ( 600, 0, 300, 300)), + ('caffevis_status', ( 900, 0, 2*20 + 10, 1500)), + ('caffevis_control', ( 0, 300, 3*20, 900)), + ('caffevis_layers', (3*20, 300, 900-3*20, 900)), + ('caffevis_jpgvis', ( 0, 1200, 900, 300)), + ('caffevis_buttons', ( 0, 1500, 900, 300)), ) window_panes = locals().get('window_panes', default_window_panes) @@ -107,14 +113,24 @@ static_files_ignore_case = locals().get('static_files_ignore_case', True) # True to stretch to square, False to crop to square. (Can change at # runtime via 'stretch_mode' key.) 
-static_file_stretch_mode = locals().get('static_file_stretch_mode', False) +static_file_stretch_mode = locals().get('static_file_stretch_mode', True) + +# whether the loaded network is a siamese network +is_siamese = locals().get('is_siamese', False) + +# siamese input mode, can be either 'concat_channelwise' or 'concat_along_width' +siamese_input_mode = locals().get('siamese_input_mode', 'concat_channelwise') -# contains the input mode for reading static images, can be: 'directory', 'image_list', 'siamese_image_list' +# contains the input mode for reading static images, can be: 'directory', 'image_list' static_files_input_mode = locals().get('static_files_input_mode', 'directory') -# contains the file name to read, relevant only when static_files_input_mode is 'image_list' or 'siamese_image_list' +# contains the file name to read, relevant only when static_files_input_mode is 'image_list' static_files_input_file = locals().get('static_files_input_file', 'images_file_list.txt') +# set to True if the model expects grayscale inputs, False otherwise. +# If value is None we set this parameter according to the network structure +is_gray_model = locals().get('is_gray_model', None) + # int, 0+. How many times to go through the main loop after a keypress # before resuming handling frames (0 to handle every frame as it # arrives). Setting this to a value > 0 can enable more responsive
caffevis_data_mean = locals().get('caffevis_data_mean', None) +# should we generate the channelwise average of the input mean file +generate_channelwise_mean = locals().get('generate_channelwise_mean', False) + # Path to file listing labels in order, one per line, used for the # below two features. None to disable. caffevis_labels = locals().get('caffevis_labels', None) @@ -176,22 +198,24 @@ # Which layers have channels/neurons corresponding to the order given # in the caffevis_labels file? Annotate these units with label text # (when those neurons are selected). None to disable. -caffevis_label_layers = locals().get('caffevis_label_layers', None) +caffevis_label_layers = locals().get('caffevis_label_layers', []) # Which layer to use for displaying class output numbers in left pane # (when no neurons are selected). None to disable. caffevis_prob_layer = locals().get('caffevis_prob_layer', None) -# String or None. Which directory to load pre-computed per-unit -# visualizations from, if any. None to disable. -caffevis_unit_jpg_dir = locals().get('caffevis_unit_jpg_dir', None) +# what is the folder format for loading precomputed visualizations, +# options are: +# "original_combined_single_image" - every unit has a single image +# "max_tracker_output" - every unit has a list of images to be loaded +caffevis_outputs_dir_folder_format = locals().get('caffevis_outputs_dir_folder_format', 'max_tracker_output') # List. For which layers should jpgs be loaded for # visualization? If a layer name (full name, not prettified) is given # here, we will try to load jpgs to visualize each unit. This is used # for pattern mode ('s' key by default) and for the right -# caffevis_jpgvis pane ('9' key by default). Empty list to disable. -caffevis_jpgvis_layers = locals().get('caffevis_jpgvis_layers', []) +# caffevis_jpgvis pane ('9' key by default).
None disables filtering (thus loading all layers); an empty list loads nothing +caffevis_jpgvis_layers = locals().get('caffevis_jpgvis_layers', None) # Dict specifying string:string mapping. Steal pattern mode and right # jpgvis pane visualizations for certain layers (e.g. pool1) from @@ -219,8 +243,15 @@ caffevis_data_mean = caffevis_data_mean.replace('%DVT_ROOT%', dvt_root) if isinstance(caffevis_labels, basestring): caffevis_labels = caffevis_labels.replace('%DVT_ROOT%', dvt_root) -if isinstance(caffevis_unit_jpg_dir, basestring): - caffevis_unit_jpg_dir = caffevis_unit_jpg_dir.replace('%DVT_ROOT%', dvt_root) +if isinstance(caffevis_outputs_dir, basestring): + caffevis_outputs_dir = caffevis_outputs_dir.replace('%DVT_ROOT%', dvt_root) +if isinstance(static_files_input_file, basestring): + static_files_input_file = static_files_input_file.replace('%DVT_ROOT%', dvt_root) +if isinstance(static_files_dir, basestring): + static_files_dir = static_files_dir.replace('%DVT_ROOT%', dvt_root) + + + # Pause Caffe forward/backward computation for this many seconds after a keypress. This is to keep the processor free for a brief period after a keypress, which allows the interface to feel much more responsive. After this period has passed, Caffe resumes computation, in CPU mode often occupying all cores. Default: .1 caffevis_pause_after_keys = locals().get('caffevis_pause_after_keys', .10) @@ -229,14 +260,14 @@ # CaffeProc thread dies after this many seconds without a # heartbeat. Useful during debugging to avoid other threads running # after main thread has crashed. -caffevis_heartbeat_required = locals().get('caffevis_heartbeat_required', 15.0) +caffevis_heartbeat_required = locals().get('caffevis_heartbeat_required', 150.0 if __debug__ else 30.0) # How far to move when using fast left/right/up/down keys caffevis_fast_move_dist = locals().get('caffevis_fast_move_dist', 3) # Size of jpg reading cache in bytes (default: 4GB) # Note: largest fc6/fc7 images are ~600MB.
Cache smaller than this will be painfully slow when using patterns_mode for fc6 and fc7. # Cache use when all layers have been loaded is ~1.6GB -caffevis_jpg_cache_size = locals().get('caffevis_jpg_cache_size', 2000*1024**2) +caffevis_jpg_cache_size = locals().get('caffevis_jpg_cache_size', 4000*1024**2) caffevis_grad_norm_blur_radius = locals().get('caffevis_grad_norm_blur_radius', 4.0) @@ -259,6 +290,8 @@ # Initially show jpg vis or not (toggle with default key '9') caffevis_init_show_unit_jpgs = locals().get('caffevis_init_show_unit_jpgs', True) +caffevis_keep_aspect_ratio = locals().get('caffevis_keep_aspect_ratio', False) + # extra pixel spacing between lines. Default: 4 = not much space / tight layout caffevis_control_line_spacing = locals().get('caffevis_control_line_spacing', 4) # Font settings for control pane (list of layers) @@ -288,6 +321,23 @@ caffevis_status_thick = locals().get('caffevis_status_thick', 1) caffevis_jpgvis_stack_vert = locals().get('caffevis_jpgvis_stack_vert', True) +# Font settings for buttons pane (left most pane) +caffevis_buttons_header_face = locals().get('caffevis_buttons_header_face', 'FONT_HERSHEY_COMPLEX_SMALL') +caffevis_buttons_header_fsize = locals().get('caffevis_buttons_header_fsize', 1.0 * global_font_size) +caffevis_buttons_header_clr = locals().get('caffevis_buttons_header_clr', (.8,.8,.8)) +caffevis_buttons_header_thick = locals().get('caffevis_buttons_header_thick', 2) +caffevis_buttons_normal_face = locals().get('caffevis_buttons_normal_face', 'FONT_HERSHEY_COMPLEX_SMALL') +caffevis_buttons_normal_fsize = locals().get('caffevis_buttons_normal_fsize', 1.0 * global_font_size) +caffevis_buttons_normal_clr = locals().get('caffevis_buttons_normal_clr', (.8,.8,.8)) +caffevis_buttons_normal_thick = locals().get('caffevis_buttons_normal_thick', 1) +caffevis_buttons_selected_face = locals().get('caffevis_buttons_selected_face', 'FONT_HERSHEY_COMPLEX_SMALL') +caffevis_buttons_selected_fsize = 
locals().get('caffevis_buttons_selected_fsize', 1.0 * global_font_size) +caffevis_buttons_selected_clr = locals().get('caffevis_buttons_selected_clr', (.5,1,.5)) +caffevis_buttons_selected_thick = locals().get('caffevis_buttons_selected_thick', 1) + +caffevis_buttons_loc = locals().get('caffevis_buttons_loc', (15,10)) # r,c order +caffevis_buttons_line_spacing = locals().get('caffevis_buttons_line_spacing', 10) # extra pixel spacing between lines + # Font settings for class prob output (top 5 classes listed on left) caffevis_class_face = locals().get('caffevis_class_face', 'FONT_HERSHEY_COMPLEX_SMALL') caffevis_class_loc = locals().get('caffevis_class_loc', (20,10)) # r,c order @@ -304,9 +354,105 @@ caffevis_label_fsize = locals().get('caffevis_label_fsize', 1.0 * global_font_size) caffevis_label_thick = locals().get('caffevis_label_thick', 1) -# caffe net parameter - channel swap -caffe_net_channel_swap = locals().get('caffe_net_channel_swap', (2,1,0)) +# Font settings for score overlay text (shown on maximal images on rightmost pane) +caffevis_score_face = locals().get('caffevis_score_face', 5) # this is a hacky way to use FONT_HERSHEY_COMPLEX_SMALL +caffevis_score_loc = locals().get('caffevis_score_loc', (20,10)) # r,c order +caffevis_score_clr = locals().get('caffevis_score_clr', (.5,1,.5)) +caffevis_score_fsize = locals().get('caffevis_score_fsize', 1.0 * global_font_size) +caffevis_score_thick = locals().get('caffevis_score_thick', 1) + +# how should histograms be loaded: 'calculate_in_realtime' or 'load_from_file' +caffevis_histograms_format = locals().get('caffevis_histograms_format','load_from_file') + +# should we black out maximal input images with a zero or negative activation score +caffevis_clear_negative_activations = locals().get('caffevis_clear_negative_activations', False) + +# folder for generating and reading deep vis outputs +caffevis_outputs_dir = locals().get('caffevis_outputs_dir', '.') + +# caffe net parameter - channel swap, default is None
which will make an automatic decision according to other settings +# the automatic setting is either (2,1,0) or (2,1,0,5,4,3) according to the is_siamese value and siamese_input_mode +caffe_net_channel_swap = locals().get('caffe_net_channel_swap', None) + +# caffe net parameter - transpose, used to convert HxWxK to KxHxW, when None uses caffe default which is (2,0,1) +# this parameter should rarely change +caffe_net_transpose = locals().get('caffe_net_transpose', None) + +# caffe net parameter - raw scale, multiplies input BEFORE mean subtraction +caffe_net_raw_scale = locals().get('caffe_net_raw_scale', 255.0) + +# caffe net parameter - input scale, multiplies input AFTER mean subtraction +caffe_net_input_scale = locals().get('caffe_net_input_scale', None) + +# caffe net parameter - image dims +caffe_net_image_dims = locals().get('caffe_net_image_dims', None) + +# default value for do_maxes parameter in max_tracker +max_tracker_do_maxes = locals().get('max_tracker_do_maxes', True) +# default value for do_deconv parameter in max tracker +max_tracker_do_deconv = locals().get('max_tracker_do_deconv', True) + +# default value for do_deconv_norm parameter in max tracker +max_tracker_do_deconv_norm = locals().get('max_tracker_do_deconv_norm', False) + +# default value for do_backprop parameter in max tracker +max_tracker_do_backprop = locals().get('max_tracker_do_backprop', False) + +# default value for do_backprop_norm parameter in max tracker +max_tracker_do_backprop_norm = locals().get('max_tracker_do_backprop_norm', False) + +# default value for do_info parameter in max tracker +max_tracker_do_info = locals().get('max_tracker_do_info', True) + +# default value for do_histograms parameter in max tracker +max_tracker_do_histograms = locals().get('max_tracker_do_histograms', True) + +# default value for do_correlation parameter in max tracker +max_tracker_do_correlation = locals().get('max_tracker_do_correlation', True) + +# default batch size used in max_tracker
+max_tracker_batch_size = locals().get('max_tracker_batch_size', 1) + +# list of layers to output when using offline scripts +layers_to_output_in_offline_scripts = locals().get('layers_to_output_in_offline_scripts', []) + +# list of siamese layers/blobs to show +# note: if an item in the list is a pair of layers, then it is a siamese layer +layers_list = locals().get('layers_list', []) + +# rand-seed parameter for optimize_image.py +optimize_image_rand_seed = locals().get('optimize_image_rand_seed', 0) + +# decay parameter for optimize_image.py +optimize_image_decay = locals().get('optimize_image_decay', 0.0001) + +# blur-radius parameter for optimize_image.py +optimize_image_blur_radius = locals().get('optimize_image_blur_radius', 1.0) + +# blur-every parameter for optimize_image.py +optimize_image_blue_every = locals().get('optimize_image_blue_every', 4) + +# lr-policy parameter for optimize_image.py +optimize_image_lr_policy = locals().get('optimize_image_lr_policy', 'constant') + +# lr-params parameter for optimize_image.py +optimize_image_lr_params = locals().get('optimize_image_lr_params', '{"lr": 100.0}') + +# max-iter parameter for optimize_image.py +optimize_image_max_iters = locals().get('optimize_image_max_iters', [1000]) + +# output-prefix parameter for optimize_image.py +optimize_image_output_prefix = locals().get('optimize_image_output_prefix', '%(p.push_layer)s/unit_%(p.push_channel)04d/opt_%(r.batch_index)03d_seed%(p.rand_seed)d') + +# parameter which marks whether we should also generate the plus-mean image of the optimized image +optimize_image_generate_plus_mean = locals().get('optimize_image_generate_plus_mean', False) + +# batch size used in optimize_image.py +optimize_image_batch_size = locals().get('optimize_image_batch_size', 1) + +# channels to generate in optimize_image.py, if list is empty we generate all the channels in the layer +optimize_image_channels = locals().get('optimize_image_channels', []) #################################### #
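Every option added to settings.py in the hunks above follows one pattern: model and user settings are imported first (from settings_model_selector), then each default is applied with `locals().get(name, default)`, so any previously defined value wins. A minimal, self-contained sketch of that pattern; the real repository imports settings files, while this demo stands them in with `exec`'d strings:

```python
# Sketch of the settings-override pattern used in settings.py: overrides
# are defined before the defaults, and each default is fetched with
# locals().get(name, default), so an already-defined name is kept.

user_settings = "optimize_image_batch_size = 8\n"  # stands in for settings_user.py

base_settings = (  # stands in for the defaults in settings.py
    "optimize_image_batch_size = locals().get('optimize_image_batch_size', 1)\n"
    "optimize_image_max_iters = locals().get('optimize_image_max_iters', [1000])\n"
)

scope = {}
exec(user_settings, scope)  # user overrides are defined first...
exec(base_settings, scope)  # ...then defaults fill in anything still missing

print(scope['optimize_image_batch_size'])  # user override kept
print(scope['optimize_image_max_iters'])   # default applied
```

At module level inside `exec`, `locals()` returns the same mapping passed as globals, which is why a value defined by the earlier `exec` is found by `.get` and the default is skipped.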
@@ -318,13 +464,19 @@ bound_locals = locals() def assert_in_settings(setting_name): if not setting_name in bound_locals: - raise Exception('The "%s" setting is required; be sure to define it in settings_local.py' % setting_name) + raise Exception('The "%s" setting is required; be sure to define it in settings_MODEL.py' % setting_name) +# Set this to point to your compiled checkout of caffe assert_in_settings('caffevis_caffe_root') + +# Path to caffe deploy prototxt file. Minibatch size should be 1. assert_in_settings('caffevis_deploy_prototxt') + +# Path to network weights to load. assert_in_settings('caffevis_network_weights') assert_in_settings('caffevis_data_mean') # Check that caffe directory actually exists if not os.path.exists(caffevis_caffe_root): - raise Exception('The Caffe directory specified in settings_local.py, %s, does not exist. Set the caffevis_caffe_root variable in your settings_local.py to the path of your compiled Caffe checkout.' % caffevis_caffe_root) + raise Exception('The Caffe directory specified in settings_model_selector.py, %s, does not exist. Set the caffevis_caffe_root variable in your settings_model_selector.py to the path of your compiled Caffe checkout.' 
% caffevis_caffe_root) + diff --git a/settings_misc.py b/settings_misc.py new file mode 100644 index 000000000..ebaba79ab --- /dev/null +++ b/settings_misc.py @@ -0,0 +1,300 @@ +import os,sys,inspect +currentdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe()))) +parentdir = os.path.dirname(currentdir) +sys.path.insert(0,parentdir) + +import cPickle as pickle + +from caffevis.caffevis_helper import set_mean +from caffe_misc import layer_name_to_top_name, get_max_data_extent +from misc import mkdir_p + +def deduce_calculated_settings_without_network(settings): + set_calculated_siamese_network_format(settings) + set_calculated_channel_swap(settings) + read_network_dag(settings) + + +def deduce_calculated_settings_with_network(settings, net): + set_calculated_is_gray_model(settings, net) + set_calculated_image_dims(settings, net) + + +def set_calculated_is_gray_model(settings, net): + if settings.is_gray_model is not None: + settings._calculated_is_gray_model = settings.is_gray_model + else: + input_shape = net.blobs[net.inputs[0]].data.shape + channels = input_shape[1] + if channels == 1: + settings._calculated_is_gray_model = True + elif channels == 2 and settings.is_siamese: + settings._calculated_is_gray_model = True + elif channels == 3: + settings._calculated_is_gray_model = False + elif channels == 6 and settings.is_siamese: + settings._calculated_is_gray_model = False + else: + settings._calculated_is_gray_model = None + + +def set_calculated_image_dims(settings, net): + if settings.caffe_net_image_dims is not None: + settings._calculated_image_dims = settings.caffe_net_image_dims + else: + input_shape = net.blobs[net.inputs[0]].data.shape + settings._calculated_image_dims = input_shape[2:4] + + +def set_calculated_siamese_network_format(settings): + + settings._calculated_siamese_network_format = 'normal' + + for layer_def in settings.layers_list: + if layer_def['format'] != 'normal': + settings._calculated_siamese_network_format = 
layer_def['format'] + return + + +def set_calculated_channel_swap(settings): + + if settings.caffe_net_channel_swap is not None: + settings._calculated_channel_swap = settings.caffe_net_channel_swap + + else: + if settings.is_siamese and settings.siamese_input_mode == 'concat_channelwise': + settings._calculated_channel_swap = (2, 1, 0, 5, 4, 3) + + else: + settings._calculated_channel_swap = (2, 1, 0) + + +def process_network_proto(settings): + + settings._processed_deploy_prototxt = settings.caffevis_deploy_prototxt + ".processed_by_deepvis" + + # check if force_backwards is missing + found_force_backwards = False + with open(settings.caffevis_deploy_prototxt, 'r') as proto_file: + for line in proto_file: + fields = line.strip().split() + if len(fields) == 2 and fields[0] == 'force_backward:' and fields[1] == 'true': + found_force_backwards = True + break + + # write file, adding force_backward if needed + with open(settings.caffevis_deploy_prototxt, 'r') as proto_file: + with open(settings._processed_deploy_prototxt, 'w') as new_proto_file: + if not found_force_backwards: + new_proto_file.write('force_backward: true\n') + for line in proto_file: + new_proto_file.write(line) + + # run upgrade tool on new file name (same output file) + upgrade_tool_command_line = settings.caffevis_caffe_root + '/build/tools/upgrade_net_proto_text.bin ' + settings._processed_deploy_prototxt + ' ' + settings._processed_deploy_prototxt + os.system(upgrade_tool_command_line) + + return + + +def load_network(settings): + + # Set the mode to CPU or GPU. Note: in the latest Caffe + # versions, there is one Caffe object *per thread*, so the + # mode must be set per thread! Here we set the mode for the + # main thread; it is also separately set in CaffeProcThread. 
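The `force_backward` handling in `process_network_proto` above scans the deploy prototxt line by line and prepends the directive only when it is absent. That scan-and-prepend step can be sketched in isolation (the prototxt fragment here is made up):

```python
def add_force_backward(proto_text):
    # Mirror the scan in process_network_proto above: look for an existing
    # 'force_backward: true' line, comparing whitespace-split fields.
    found = False
    for line in proto_text.splitlines():
        fields = line.strip().split()
        if len(fields) == 2 and fields[0] == 'force_backward:' and fields[1] == 'true':
            found = True
            break
    # Prepend the directive only when it is missing; leave the rest untouched.
    return proto_text if found else 'force_backward: true\n' + proto_text

proto = 'name: "toy_net"\ninput: "data"\n'  # made-up deploy prototxt fragment
processed = add_force_backward(proto)
print(processed.splitlines()[0])  # force_backward: true
```

Note the function is idempotent: running it on an already-processed prototxt leaves the text unchanged, which matches the toolbox writing the processed file unconditionally.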
+ sys.path.insert(0, os.path.join(settings.caffevis_caffe_root, 'python')) + import caffe + + if settings.caffevis_mode_gpu: + caffe.set_mode_gpu() + caffe.set_device(settings.caffevis_gpu_id) + print 'Loaded caffe in GPU mode, using device', settings.caffevis_gpu_id + + else: + caffe.set_mode_cpu() + print 'Loaded caffe in CPU mode' + + process_network_proto(settings) + + deduce_calculated_settings_without_network(settings) + + net = caffe.Classifier( + settings._processed_deploy_prototxt, + settings.caffevis_network_weights, + image_dims=settings.caffe_net_image_dims, + mean=None, # Set to None for now, assign later + input_scale=settings.caffe_net_input_scale, + raw_scale=settings.caffe_net_raw_scale, + channel_swap=settings._calculated_channel_swap) + + deduce_calculated_settings_with_network(settings, net) + + if settings.caffe_net_transpose: + net.transformer.set_transpose(net.inputs[0], settings.caffe_net_transpose) + + data_mean = set_mean(settings.caffevis_data_mean, settings.generate_channelwise_mean, net) + + return net, data_mean + + +class LayerRecord: + + def __init__(self, layer_def): + + self.layer_def = layer_def + self.name = layer_def.name + self.type = layer_def.type + + # keep filter, stride and pad + if layer_def.type == 'Convolution': + self.filter = list(layer_def.convolution_param.kernel_size) + if len(self.filter) == 1: + self.filter *= 2 + self.pad = list(layer_def.convolution_param.pad) + if len(self.pad) == 0: + self.pad = [0, 0] + elif len(self.pad) == 1: + self.pad *= 2 + self.stride = list(layer_def.convolution_param.stride) + if len(self.stride) == 0: + self.stride = [1, 1] + elif len(self.stride) == 1: + self.stride *= 2 + + elif layer_def.type == 'Pooling': + self.filter = [layer_def.pooling_param.kernel_size] + if len(self.filter) == 1: + self.filter *= 2 + self.pad = [layer_def.pooling_param.pad] + if len(self.pad) == 0: + self.pad = [0, 0] + elif len(self.pad) == 1: + self.pad *= 2 + self.stride = 
[layer_def.pooling_param.stride] + if len(self.stride) == 0: + self.stride = [1, 1] + elif len(self.stride) == 1: + self.stride *= 2 + + else: + self.filter = [0, 0] + self.pad = [0, 0] + self.stride = [1, 1] + + # keep tops + self.tops = list(layer_def.top) + + # keep bottoms + self.bottoms = list(layer_def.bottom) + + # list of parent layers + self.parents = [] + + # list of child layers + self.children = [] + + pass + + +def read_network_dag(settings): + from caffe.proto import caffe_pb2 + from google.protobuf import text_format + + # load prototxt file + network_def = caffe_pb2.NetParameter() + with open(settings._processed_deploy_prototxt, 'r') as proto_file: + text_format.Merge(str(proto_file.read()), network_def) + + # map layer name to layer record + layer_name_to_record = dict() + for layer_def in network_def.layer: + if (len(layer_def.include) == 0) or (caffe_pb2.TEST in [item.phase for item in layer_def.include]): + layer_name_to_record[layer_def.name] = LayerRecord(layer_def) + + top_to_layers = dict() + for layer in network_def.layer: + # no specific phase, or TEST phase is specifically asked for + if (len(layer.include) == 0) or (caffe_pb2.TEST in [item.phase for item in layer.include]): + for top in layer.top: + if top not in top_to_layers: + top_to_layers[top] = list() + top_to_layers[top].append(layer.name) + + # find parents and children of all layers + for child_layer_name in layer_name_to_record.keys(): + child_layer_def = layer_name_to_record[child_layer_name] + for bottom in child_layer_def.bottoms: + for parent_layer_name in top_to_layers[bottom]: + if parent_layer_name in layer_name_to_record: + parent_layer_def = layer_name_to_record[parent_layer_name] + if parent_layer_def not in child_layer_def.parents: + child_layer_def.parents.append(parent_layer_def) + if child_layer_def not in parent_layer_def.children: + parent_layer_def.children.append(child_layer_def) + + # update filter, stride, pad for maxout "structures" + for layer_name in 
layer_name_to_record.keys(): + layer_def = layer_name_to_record[layer_name] + if layer_def.type == 'Eltwise' and \ + len(layer_def.parents) == 1 and \ + layer_def.parents[0].type == 'Slice' and \ + len(layer_def.parents[0].parents) == 1 and \ + layer_def.parents[0].parents[0].type in ['Convolution', 'InnerProduct']: + layer_def.filter = layer_def.parents[0].parents[0].filter + layer_def.stride = layer_def.parents[0].parents[0].stride + layer_def.pad = layer_def.parents[0].parents[0].pad + + # keep helper variables in settings + settings._network_def = network_def + settings._layer_name_to_record = layer_name_to_record + + return + + +def _get_receptive_fields_cache_filename(settings): + return os.path.join(settings.caffevis_outputs_dir, 'receptive_fields_cache.pickled') + +def get_receptive_field(settings, net, layer_name): + + # flag which indicates whether the dictionary was changed, and hence needs to be written back to the cache + should_save_to_cache = False + + # check if dictionary exists + if not hasattr(settings, '_receptive_field_per_layer'): + + # if it doesn't, try to load it from file + receptive_fields_cache_filename = _get_receptive_fields_cache_filename(settings) + if os.path.isfile(receptive_fields_cache_filename): + try: + with open(receptive_fields_cache_filename, 'rb') as receptive_fields_cache_file: + settings._receptive_field_per_layer = pickle.load(receptive_fields_cache_file) + except: + settings._receptive_field_per_layer = dict() + should_save_to_cache = True + else: + settings._receptive_field_per_layer = dict() + should_save_to_cache = True + + # calculate lazily + if not settings._receptive_field_per_layer.has_key(layer_name): + print "Calculating receptive fields for layer %s" % (layer_name) + top_name = layer_name_to_top_name(net, layer_name) + if top_name is not None: + blob = net.blobs[top_name].data + is_spatial = (len(blob.shape) == 4) + layer_receptive_field = get_max_data_extent(net, settings, layer_name, is_spatial) + 
settings._receptive_field_per_layer[layer_name] = layer_receptive_field + should_save_to_cache = True + + if should_save_to_cache: + try: + receptive_fields_cache_filename = _get_receptive_fields_cache_filename(settings) + mkdir_p(settings.caffevis_outputs_dir) + with open(receptive_fields_cache_filename, 'wb') as receptive_fields_cache_file: + pickle.dump(settings._receptive_field_per_layer, receptive_fields_cache_file, -1) + except IOError: + # ignore problems in cache saving + pass + + return settings._receptive_field_per_layer[layer_name] \ No newline at end of file diff --git a/settings_model_selector.py b/settings_model_selector.py new file mode 100644 index 000000000..b37cf17cb --- /dev/null +++ b/settings_model_selector.py @@ -0,0 +1,21 @@ + +# Import user settings. Turn off creation of X.pyc to avoid stale settings if X.py is removed. +import os +import sys +sys.dont_write_bytecode = True +try: + from settings_user import * +except ImportError: + if not os.path.exists('settings_user.py'): + raise Exception('Could not import settings_user. Did you create it from the template? Start with:\n\n $ cp settings_user.py.example settings_user.py') + else: + raise +# Resume usual pyc creation +sys.dont_write_bytecode = False + +caffevis_caffe_root = os.path.join(os.path.dirname(os.path.abspath(__file__)),'./caffe') + +# the following code dynamically runs the import command: +# from model_settings.settings_YOUR_MODEL import * +import_code = 'from model_settings.settings_' + model_to_load + ' import *' +exec (import_code, globals()) diff --git a/settings_user.py.example b/settings_user.py.example new file mode 100644 index 000000000..0ecfe1ebb --- /dev/null +++ b/settings_user.py.example @@ -0,0 +1,12 @@ + +##################################### user related settings ##################################### + +# Use GPU? Default is True. 
+#caffevis_mode_gpu = False + +#caffevis_gpu_id = 0 + +# network selection; per-model settings live in model_settings/settings_MODEL.py +model_to_load = 'caffenet_yos' # AlexNet +# model_to_load = 'bvlc_googlenet' +# model_to_load = 'squeezenet' diff --git a/siamese_helper.py b/siamese_helper.py new file mode 100644 index 000000000..3ea78d0c5 --- /dev/null +++ b/siamese_helper.py @@ -0,0 +1,514 @@ +import os +import numpy as np +from numpy import expand_dims, concatenate +from caffe_misc import layer_name_to_top_name +from image_misc import resize_without_fit + +class SiameseViewMode: + FIRST_IMAGE = 0 + SECOND_IMAGE = 1 + BOTH_IMAGES = 2 + NUMBER_OF_MODES = 3 + +class SiameseHelper(object): + '''helper class for handling operations related to siamese networks; + it encapsulates the different types of siamese network implementations''' + + def __init__(self, layers_list): + + # define class members + self.layers_list = layers_list + self.layer_name_to_normalized_layer_name = dict() + self.normalized_layer_name_to_denormalized_layer_name = dict() + self.layer_name_to_index_of_saved_image = dict() + self.layer_name_to_format = dict() + + # init dictionaries + self._init_layer_name_to_normalized_layer_name() + self._init_normalized_layer_name_to_denormalized_layer_name() + self._init_layer_name_to_index_of_saved_image() + self._init_layer_name_to_format() + + return + + def _init_layer_name_to_normalized_layer_name(self): + ''' + init layer_name_to_normalized_layer_name dictionary + :return: none + ''' + + for layer_def in self.layers_list: + + layer_format = layer_def['format'] + layer_names = layer_def['name/s'] + + if layer_format == 'normal': + self.layer_name_to_normalized_layer_name[layer_names] = layer_names + + elif layer_format == 'siamese_layer_pair': + self.layer_name_to_normalized_layer_name[layer_names[0]] = layer_names[0] + self.layer_name_to_normalized_layer_name[layer_names[1]] = layer_names[0] + + elif layer_format == 
'siamese_batch_pair': + self.layer_name_to_normalized_layer_name[layer_names] = layer_names + + return + + def _init_normalized_layer_name_to_denormalized_layer_name(self): + ''' + init normalized_layer_name_to_denormalized_layer_name dictionary + :return: none + ''' + + for layer_def in self.layers_list: + + layer_format = layer_def['format'] + layer_names = layer_def['name/s'] + + if layer_format == 'normal': + self.normalized_layer_name_to_denormalized_layer_name[layer_names] = layer_names + + elif layer_format == 'siamese_layer_pair': + self.normalized_layer_name_to_denormalized_layer_name[layer_names[1]] = layer_names[1] + self.normalized_layer_name_to_denormalized_layer_name[layer_names[0]] = layer_names[1] + + elif layer_format == 'siamese_batch_pair': + self.normalized_layer_name_to_denormalized_layer_name[layer_names] = layer_names + + return + + def _init_layer_name_to_index_of_saved_image(self): + ''' + init layer_name_to_index_of_saved_image dictionary + :return: none + ''' + + for layer_def in self.layers_list: + + layer_format = layer_def['format'] + layer_names = layer_def['name/s'] + + if layer_format == 'normal': + self.layer_name_to_index_of_saved_image[layer_names] = -1 + + elif layer_format == 'siamese_layer_pair': + self.layer_name_to_index_of_saved_image[layer_names[0]] = 0 + self.layer_name_to_index_of_saved_image[layer_names[1]] = 1 + + elif layer_format == 'siamese_batch_pair': + self.layer_name_to_index_of_saved_image[layer_names] = -1 + + return + + def _init_layer_name_to_format(self): + ''' + init layer_name_to_format dictionary + :return: none + ''' + + for layer_def in self.layers_list: + layer_format = layer_def['format'] + layer_names = layer_def['name/s'] + + if layer_format == 'normal': + self.layer_name_to_format[layer_names] = layer_format + + elif layer_format == 'siamese_layer_pair': + self.layer_name_to_format[layer_names[0]] = layer_format + self.layer_name_to_format[layer_names[1]] = layer_format + + elif layer_format == 
'siamese_batch_pair': + self.layer_name_to_format[layer_names] = layer_format + + def normalize_layer_name_for_max_tracker(self, layer_name): + ''' + function used to normalize layer name, e.g. 'conv1' and 'conv1_p' will be normalized to 'conv1', given suitable + layers_list setting. + :param layer_name: layer name to normalize + :return: normalized layer name + ''' + + if self.layer_name_to_normalized_layer_name.has_key(layer_name): + return self.layer_name_to_normalized_layer_name[layer_name] + + return layer_name + + def denormalize_layer_name_for_max_tracker(self, normalized_layer_name, selected_input_index): + ''' + function which returns the denormalized form of the layer name, given the normalized layer name and selected + input index which should be 0 or 1 + e.g. denormalize_layer_name_for_max_tracker('conv1', 1) == 'conv1_p', given suitable layers_list setting + :param normalized_layer_name: normalized layer name + :param selected_input_index: selected input index, 0 or 1 + :return: the denormalized layer name + ''' + + if selected_input_index == 0: + return normalized_layer_name + + if self.normalized_layer_name_to_denormalized_layer_name.has_key(normalized_layer_name): + return self.normalized_layer_name_to_denormalized_layer_name[normalized_layer_name] + + # this can happen for layer names which don't appear in the layers_list setting + return normalized_layer_name + + def get_index_of_saved_image_by_layer_name(self, layer_name): + ''' + function which returns the index of image to save (0 or 1) given the layer name + e.g. 
for 'conv1_p' returns 1, for 'conv1' returns 0 + the decision is done using the layers_list setting used in max tracker + :param layer_name: layer name + :return: index of image in the pair, 0 or 1 + ''' + + if self.layer_name_to_index_of_saved_image.has_key(layer_name): + return self.layer_name_to_index_of_saved_image[layer_name] + + # this can happen for layer names which don't appear in the layers_list setting + return -1 + + def get_layer_format_by_layer_name(self, layer_name): + ''' + function which returns the layer format given the layer name + :param layer_name: layer name + :return: layer format + ''' + + if self.layer_name_to_format.has_key(layer_name): + return self.layer_name_to_format[layer_name] + + # fallback to normal format + return 'normal' + + @staticmethod + def get_header_from_layer_def(layer_def): + ''' + helper function which returns the header name, given a single layer or a layer pair + :param layer: can be either a single layer string, or a pair of layers + :return: header for layer + ''' + + # if we have only a single layer, the header is the layer name + + if layer_def['format'] == 'normal': + return layer_def['name/s'] + + elif layer_def['format'] == 'siamese_layer_pair': + # build header in format: common_prefix + first_postfix | second_postfix + prefix = os.path.commonprefix(layer_def['name/s']) + prefix_len = len(prefix) + postfix0 = layer_def['name/s'][0][prefix_len:] + postfix1 = layer_def['name/s'][1][prefix_len:] + header_name = '%s%s|%s' % (prefix, postfix0, postfix1) + return header_name + + elif layer_def['format'] == 'siamese_batch_pair': + return layer_def['name/s'] + + @staticmethod + def get_default_layer_name(layer_def): + ''' + get layer name when the caller needs some 'default' choice + :param layer_def: layer definition object + :return: default_layer_name + ''' + + if layer_def['format'] == 'normal': + default_layer_name = layer_def['name/s'] + + elif layer_def['format'] == 'siamese_layer_pair': + default_layer_name = 
layer_def['name/s'][0] + + elif layer_def['format'] == 'siamese_batch_pair': + default_layer_name = layer_def['name/s'] + + else: + raise Exception("get_default_layer_name() got invalid layer_def['format']=%s" % layer_def['format']) + + return default_layer_name + + @staticmethod + def get_single_selected_layer_name(layer_def, siamese_view_mode): + + if layer_def['format'] == 'normal': + return layer_def['name/s'] + + elif layer_def['format'] == 'siamese_layer_pair': + if siamese_view_mode == SiameseViewMode.FIRST_IMAGE: + return layer_def['name/s'][0] + elif siamese_view_mode == SiameseViewMode.SECOND_IMAGE: + return layer_def['name/s'][1] + else: + raise Exception('in get_single_selected_layer_name() siamese_view_mode cant be BOTH') + + elif layer_def['format'] == 'siamese_batch_pair': + return layer_def['name/s'] + + else: + raise Exception("get_single_selected_layer_name() got invalid layer_def['format']=%s" % layer_def['format']) + + @staticmethod + def _get_single_selected_blob(net, layer_def, siamese_view_mode, blob_selector): + ''' + function used to extract the single selected blob according to the specified layer and siamese input mode and + blob selector. 
+ note that it is invalid to call this function when siamese input mode is BOTH + this is the main function which contains logic on the siamese network internal format structure + :param net: network containing the blob to extract + :param layer_def: layer requested + :param siamese_view_mode: siamese view mode + :param blob_selector: lambda function which lets us choose between data and diff blobs + :return: requested single blob + ''' + + if layer_def['format'] == 'normal': + return blob_selector(net.blobs[layer_name_to_top_name(net, layer_def['name/s'])])[0] + + elif layer_def['format'] == 'siamese_layer_pair': + if siamese_view_mode == SiameseViewMode.FIRST_IMAGE: + selected_layer_name = layer_def['name/s'][0] + elif siamese_view_mode == SiameseViewMode.SECOND_IMAGE: + selected_layer_name = layer_def['name/s'][1] + else: + raise Exception('in get_single_selected_blob() siamese_view_mode cant be BOTH') + return blob_selector(net.blobs[layer_name_to_top_name(net, selected_layer_name)])[0] + + elif layer_def['format'] == 'siamese_batch_pair': + if siamese_view_mode == SiameseViewMode.FIRST_IMAGE: + selected_batch_index = 0 + elif siamese_view_mode == SiameseViewMode.SECOND_IMAGE: + selected_batch_index = 1 + else: + raise Exception('in get_single_selected_blob() siamese_view_mode cant be BOTH') + return blob_selector(net.blobs[layer_name_to_top_name(net, layer_def['name/s'])])[selected_batch_index] + + else: + raise Exception("get_single_selected_blob() got invalid layer_def['format']=%s" % layer_def['format']) + + + @staticmethod + def get_single_selected_data_blob(net, layer_def, siamese_view_mode): + ''' + function used to extract the single selected DATA blob according to the specified layer and siamese input mode + note that it is invalid to call this function when siamese input mode is BOTH + :param net: network containing the blob to extract + :param layer_def: layer requested + :param siamese_view_mode: siamese view mode + :return: requested single data 
blob + ''' + + return SiameseHelper._get_single_selected_blob(net, layer_def, siamese_view_mode, blob_selector=lambda layer_object: layer_object.data) + + @staticmethod + def get_single_selected_diff_blob(net, layer_def, siamese_view_mode): + ''' + function used to extract the single selected DIFF blob according to the specified layer and siamese input mode + note that it is invalid to call this function when siamese input mode is BOTH + :param net: network containing the blob to extract + :param layer_def: layer requested + :param siamese_view_mode: siamese view mode + :return: requested single diff blob + ''' + + return SiameseHelper._get_single_selected_blob(net, layer_def, siamese_view_mode, blob_selector=lambda layer_object: layer_object.diff) + + @staticmethod + def _get_siamese_selected_blobs(net, layer_def, siamese_view_mode, blob_selector): + ''' + function used to extract both blobs according to the specified layer and siamese input mode and + blob selector. + this is the main function which contains logic on the siamese network internal format structure + :param net: network containing the blob to extract + :param layer_def: layer requested + :param siamese_view_mode: siamese view mode + :param blob_selector: + :return: first_blob, second_blob + ''' + + if layer_def['format'] == 'normal': + raise Exception('function get_siamese_blobs() should not be called when layer is in normal format') + + elif layer_def['format'] == 'siamese_layer_pair': + return blob_selector(net.blobs[layer_name_to_top_name(net, layer_def['name/s'][0])])[0], blob_selector(net.blobs[layer_name_to_top_name(net, layer_def['name/s'][1])])[0] + + elif layer_def['format'] == 'siamese_batch_pair': + return blob_selector(net.blobs[layer_name_to_top_name(net, layer_def['name/s'])])[0], blob_selector(net.blobs[layer_name_to_top_name(net, layer_def['name/s'])])[1] + + else: + raise Exception("get_siamese_blobs() got invalid layer_def['format']=%s" % layer_def['format']) + + @staticmethod + 
def get_siamese_selected_data_blobs(net, layer_def, siamese_view_mode): + ''' + function used to extract both DATA blobs according to the specified layer and siamese input mode + note that it is invalid to call this function when siamese input mode is not BOTH + :param net: network containing the blob to extract + :param layer_def: layer requested + :param siamese_view_mode: siamese view mode + :return: first_blob, second_blob + ''' + + return SiameseHelper._get_siamese_selected_blobs(net, layer_def, siamese_view_mode, blob_selector=lambda layer_object: layer_object.data) + + @staticmethod + def get_siamese_selected_diff_blobs(net, layer_def, siamese_view_mode): + ''' + function used to extract both DIFF blobs according to the specified layer and siamese input mode + note that it is invalid to call this function when siamese input mode is not BOTH + :param net: network containing the blob to extract + :param layer_def: layer requested + :param siamese_view_mode: siamese view mode + :return: first_blob, second_blob + ''' + + return SiameseHelper._get_siamese_selected_blobs(net, layer_def, siamese_view_mode, blob_selector=lambda layer_object: layer_object.diff) + + @staticmethod + def is_pair_of_layers(layer_def): + + return layer_def['format'] in ['siamese_layer_pair', 'siamese_batch_pair'] + + @staticmethod + def siamese_view_mode_has_two_images(layer_def, siamese_view_mode): + ''' + helper function which checks whether the input mode is two images, and the provided layer contains two layer names + :param layer: can be a single string layer name, or a pair of layer names + :return: True if both the input mode is BOTH_IMAGES and layer contains two layer names, False otherwise + ''' + + return siamese_view_mode == SiameseViewMode.BOTH_IMAGES and SiameseHelper.is_pair_of_layers(layer_def) + + @staticmethod + def backward_from_layer(net, backprop_layer_def, backprop_unit, siamese_view_mode): + + # if we are in siamese_batch_pair, we don't care about siamese_view_mode since 
we must backprop through the full 2-batch + # otherwise, if we are in siamese_layer_pair, we do it on both layers only if both backprops are requested + if (backprop_layer_def['format'] == 'siamese_batch_pair') or \ + (backprop_layer_def['format'] == 'siamese_layer_pair' and siamese_view_mode == SiameseViewMode.BOTH_IMAGES): + + diffs0, diffs1 = SiameseHelper.get_siamese_selected_diff_blobs(net, backprop_layer_def, siamese_view_mode) + diffs0, diffs1 = diffs0 * 0, diffs1 * 0 + data0, data1 = SiameseHelper.get_siamese_selected_data_blobs(net, backprop_layer_def, siamese_view_mode) + diffs0[backprop_unit], diffs1[backprop_unit] = data0[backprop_unit], data1[backprop_unit] + + # add batch dimension + diffs0 = expand_dims(diffs0, 0) + diffs1 = expand_dims(diffs1, 0) + + if backprop_layer_def['format'] == 'siamese_layer_pair': + net.backward_from_layer(backprop_layer_def['name/s'][0], diffs0, zero_higher=True) + net.backward_from_layer(backprop_layer_def['name/s'][1], diffs1, zero_higher=True) + + elif backprop_layer_def['format'] == 'siamese_batch_pair': + # combine them into a 2-batch and send once + diffs = concatenate((diffs0, diffs1), axis=0) + net.backward_from_layer(backprop_layer_def['name/s'], diffs, zero_higher=True) + else: + + diffs = SiameseHelper.get_single_selected_diff_blob(net, backprop_layer_def, siamese_view_mode) + diffs = diffs * 0 + data = SiameseHelper.get_single_selected_data_blob(net, backprop_layer_def, siamese_view_mode) + diffs[backprop_unit] = data[backprop_unit] + + # add batch dimension + diffs = expand_dims(diffs, 0) + + selected_backprop_layer_name = SiameseHelper.get_single_selected_layer_name(backprop_layer_def, siamese_view_mode) + net.backward_from_layer(selected_backprop_layer_name, diffs, zero_higher=True) + + pass + + @staticmethod + def deconv_from_layer(net, backprop_layer_def, backprop_unit, siamese_view_mode, deconv_type): + + # if we are in siamese_batch_pair, we don't care about siamese_view_mode since we must do deconv on the 2-batch + # 
otherwise, if we are in siamese_layer_pair, we do it on both layers only if both deconv are requested + if (backprop_layer_def['format'] == 'siamese_batch_pair') or \ + (backprop_layer_def['format'] == 'siamese_layer_pair' and siamese_view_mode == SiameseViewMode.BOTH_IMAGES): + + diffs0, diffs1 = SiameseHelper.get_siamese_selected_diff_blobs(net, backprop_layer_def, siamese_view_mode) + diffs0, diffs1 = diffs0 * 0, diffs1 * 0 + data0, data1 = SiameseHelper.get_siamese_selected_data_blobs(net, backprop_layer_def, siamese_view_mode) + diffs0[backprop_unit], diffs1[backprop_unit] = data0[backprop_unit], data1[backprop_unit] + + # add batch dimension + diffs0 = expand_dims(diffs0, 0) + diffs1 = expand_dims(diffs1, 0) + + if backprop_layer_def['format'] == 'siamese_layer_pair': + net.deconv_from_layer(backprop_layer_def['name/s'][0], diffs0, zero_higher=True, deconv_type=deconv_type) + net.deconv_from_layer(backprop_layer_def['name/s'][1], diffs1, zero_higher=True, deconv_type=deconv_type) + + elif backprop_layer_def['format'] == 'siamese_batch_pair': + # combine them to 2-batch and send once + diffs = concatenate((diffs0, diffs1), axis=0) + + net.deconv_from_layer(backprop_layer_def['name/s'], diffs, zero_higher=True, deconv_type=deconv_type) + + else: # normal layer, or siamese layer but siamese input mode is 'first' or 'second' + + diffs = SiameseHelper.get_single_selected_diff_blob(net, backprop_layer_def, siamese_view_mode) + diffs = diffs * 0 + data = SiameseHelper.get_single_selected_data_blob(net, backprop_layer_def, siamese_view_mode) + diffs[backprop_unit] = data[backprop_unit] + + # add batch dimension + diffs = expand_dims(diffs, 0) + + selected_backprop_layer_name = SiameseHelper.get_single_selected_layer_name(backprop_layer_def, siamese_view_mode) + + net.deconv_from_layer(selected_backprop_layer_name, diffs, zero_higher=True, deconv_type=deconv_type) + + @staticmethod + def get_image_from_frame(frame, is_siamese, image_shape, siamese_view_mode): + + if 
is_siamese and ((type(frame),len(frame)) == (tuple,2)): + + if siamese_view_mode == SiameseViewMode.BOTH_IMAGES: + frame1 = frame[0] + frame2 = frame[1] + half_pane_shape = (image_shape[0], image_shape[1]/2) + frame_disp1 = resize_without_fit(frame1[:], half_pane_shape) + frame_disp2 = resize_without_fit(frame2[:], half_pane_shape) + frame_disp = concatenate((frame_disp1, frame_disp2), axis=1) + + elif siamese_view_mode == SiameseViewMode.FIRST_IMAGE: + frame_disp = resize_without_fit(frame[0], image_shape) + + elif siamese_view_mode == SiameseViewMode.SECOND_IMAGE: + frame_disp = resize_without_fit(frame[1], image_shape) + + else: + frame_disp = resize_without_fit(frame[:], image_shape) + + return frame_disp + + @staticmethod + def convert_image_pair_to_network_input_format(frame_pair, resize_shape, siamese_input_mode): + + if siamese_input_mode == 'concat_channelwise': + im_small1 = resize_without_fit(frame_pair[0], resize_shape) + im_small2 = resize_without_fit(frame_pair[1], resize_shape) + im_small = np.concatenate((im_small1, im_small2), axis=2) + elif siamese_input_mode == 'concat_along_width': + half_input_dims = (resize_shape[0], resize_shape[1] / 2) + im_small1 = resize_without_fit(frame_pair[0], half_input_dims) + im_small2 = resize_without_fit(frame_pair[1], half_input_dims) + im_small = np.concatenate((im_small1, im_small2), axis=1) + + return im_small + + @staticmethod + def get_layer_output_size(net, is_siamese, layer_def, siamese_view_mode): + + if (layer_def['format'] == 'siamese_batch_pair') or (layer_def['format'] == 'siamese_layer_pair' and siamese_view_mode == SiameseViewMode.BOTH_IMAGES): + + data0, data1 = SiameseHelper.get_siamese_selected_data_blobs(net, layer_def, siamese_view_mode) + return data0.shape + + else: + + data = SiameseHelper.get_single_selected_data_blob(net, layer_def, siamese_view_mode) + return data.shape + + pass \ No newline at end of file diff --git a/templates/index.html b/templates/index.html new file mode 100644 index 
000000000..b4dc90eff --- /dev/null +++ b/templates/index.html @@ -0,0 +1,9 @@ +<html> + <head> + <title>Video Streaming Demonstration</title> + </head> + <body> + <h1>Video Streaming Demonstration</h1> + </body> +</html> \ No newline at end of file
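Stepping back to `convert_image_pair_to_network_input_format` in `siamese_helper.py` above: the two siamese input packings differ only in the NumPy concatenation axis. With dummy arrays standing in for a resized image pair:

```python
import numpy as np

# Dummy 4x6 RGB frames standing in for a resized siamese image pair.
frame1 = np.zeros((4, 6, 3))
frame2 = np.ones((4, 6, 3))

# 'concat_channelwise': stack along the channel axis, yielding a 6-channel
# input -- the case the (2, 1, 0, 5, 4, 3) channel swap above is built for.
channelwise = np.concatenate((frame1, frame2), axis=2)
print(channelwise.shape)  # (4, 6, 6)

# 'concat_along_width': each frame is first resized to half width, then the
# halves are placed side by side along the width axis (slicing stands in
# for resize_without_fit here).
half1, half2 = frame1[:, :3, :], frame2[:, :3, :]
widthwise = np.concatenate((half1, half2), axis=1)
print(widthwise.shape)  # (4, 6, 3)
```

Both packings keep the spatial height unchanged; the first grows the channel count, the second preserves the network's expected 3-channel input at the cost of horizontal resolution per image.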