DeepLearning Caption (Mac)
Trying out the Qiita article "Chainerで画像のキャプション生成" (Generating image captions with Chainer).
Notes from when I did this on Ubuntu:
http://kubotti.hatenablog.com/entry/2016/06/07/142850
Get the source code.
https://github.com/dsanno/chainer-image-caption
git clone https://github.com/dsanno/chainer-image-caption.git
Download Flickr8K (50MB) and Flickr30K (200MB) from
http://cs.stanford.edu/people/karpathy/deepimagesent/
hdf5
pip install h5py fails because hdf5.h cannot be found:
$ pip install h5py
Downloading/unpacking h5py
  Downloading h5py-2.6.0.tar.gz (245kB): 245kB downloaded
  Running setup.py (path:/Users/kubotad/.virtualenvs/ml/build/h5py/setup.py) egg_info for package h5py
    zip_safe flag not set; analyzing archive contents...
    Installed /Users/kubotad/.virtualenvs/ml/build/h5py/.eggs/pkgconfig-1.1.0-py2.7.egg
    Searching for Cython>=0.19
(omitted)
.virtualenvs/ml/build/h5py/h5py/api_compat.h:27:10: fatal error: 'hdf5.h' file not found
#include "hdf5.h"
         ^
1 warning and 1 error generated.
error: command 'cc' failed with exit status 1
----------------------------------------
Cleaning up...
Installing HDF5 with Homebrew first, then retrying:
brew tap homebrew/science
brew install hdf5
pip install h5py
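The build error above comes down to the compiler not finding hdf5.h. Below is a minimal sketch for checking whether the header is present before retrying pip install h5py. The function name find_hdf5_header and the list of candidate directories are my own assumptions (typical Homebrew and system include paths), not something h5py or Homebrew provides.

```python
import os

# Candidate include directories. These are assumed typical locations;
# adjust them to match your own setup.
DEFAULT_INCLUDE_DIRS = [
    "/usr/local/include",     # Homebrew on Intel Macs
    "/opt/homebrew/include",  # Homebrew on Apple Silicon
    "/usr/include",
]


def find_hdf5_header(include_dirs=None):
    """Return the directories from include_dirs that contain hdf5.h."""
    if include_dirs is None:
        include_dirs = DEFAULT_INCLUDE_DIRS
    return [d for d in include_dirs
            if os.path.isfile(os.path.join(d, "hdf5.h"))]


if __name__ == "__main__":
    found = find_hdf5_header()
    if found:
        print("hdf5.h found in: " + ", ".join(found))
    else:
        print("hdf5.h not found; install HDF5 before building h5py")
```

If the header turns up in a directory the compiler does not search by default, the build can still fail, so this only narrows down the cause.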
train
This machine has no GeForce GPU, so I change the options.
5 iterations (the default is 100).
python src/train.py -s dataset.pkl -i vgg_feats.mat -o model/caption_gen --iter 5
word count: 2540
/Users/kubotad/.virtualenvs/ml/lib/python2.7/site-packages/chainer/functions/activation/lstm.py:15: RuntimeWarning: overflow encountered in exp
  return 1 / (1 + numpy.exp(-x))
epoch: 1 done
train loss: 0.0477752295387 accuracy: 0.240047393365
test loss: 0.0503208395787 accuracy: 0.314788910424
epoch: 2 done
train loss: 0.0407720221888 accuracy: 0.304062288422
test loss: 0.0463195297886 accuracy: 0.33422094885
epoch: 3 done
train loss: 0.0389543428261 accuracy: 0.319383322049
test loss: 0.0449947251971 accuracy: 0.345344231904
epoch: 4 done
train loss: 0.037905781938 accuracy: 0.330012976755
test loss: 0.0435648040157 accuracy: 0.356770877223
epoch: 5 done
train loss: 0.0370137848389 accuracy: 0.339999435793
test loss: 0.0429580021762 accuracy: 0.36130445774
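The RuntimeWarning in the log comes from numpy.exp(-x) overflowing when x is a large negative number. NumPy then yields inf, the expression 1 / (1 + inf) evaluates to 0, and training carries on. For illustration only, here is a sketch of a sigmoid that avoids the overflow by never exponentiating a positive number. It uses the standard math module, where the naive formula raises OverflowError rather than warning, and it is not Chainer's implementation; the function names are mine.

```python
import math


def naive_sigmoid(x):
    # Same formula as in the warning. math.exp raises OverflowError for
    # large negative x (NumPy would warn and return inf instead).
    return 1 / (1 + math.exp(-x))


def stable_sigmoid(x):
    # Only ever call exp() on a non-positive argument, so it cannot overflow.
    if x >= 0:
        return 1 / (1 + math.exp(-x))
    e = math.exp(x)
    return e / (1 + e)


if __name__ == "__main__":
    for x in (-1000.0, 0.0, 1000.0):
        print(x, stable_sigmoid(x))
```

Both functions agree for moderate inputs; they differ only in how they behave at the extremes.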
caffe model
Download.
http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_19_layers.caffemodel
Convert.
python src/convert_caffemodel_to_pkl.py VGG_ILSVRC_19_layers.caffemodel vgg19.pkl
generate caption
python src/generate_caption.py -s dataset.pkl -i vgg19.pkl -m model/caption_gen_0004.model -l image/list.txt
chainer-image-caption% python src/generate_caption.py -s dataset.pkl -i vgg19.pkl -m model/caption_gen_0004.model -l image/list.txt
# image/asakusa.jpg
a man is sitting on a bench
a group of people sit on a bench
a group of people are sitting on a bench
a group of people are standing in the snow
a group of people are walking through the snow
# image/tree.jpg
a man is sitting on a bench
a group of people sit on a bench
a group of people are sitting on a bench
a group of people sitting on a bench
a group of people sit on a sidewalk
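In the output above, three of the five captions are identical for the two images. To compare results like this across many images, a small parser helps. The sketch below assumes the layout is one "# <image path>" line followed by that image's captions, one per line, which is how I read the output; parse_captions is a name I made up and is not part of chainer-image-caption.

```python
def parse_captions(output):
    """Group caption lines under the '# <image path>' line that precedes them."""
    captions = {}
    current = None
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("# "):
            # A new image section starts here.
            current = line[2:]
            captions[current] = []
        elif current is not None:
            # Lines before the first '# ' header (e.g. the command) are skipped.
            captions[current].append(line)
    return captions


if __name__ == "__main__":
    sample = """# image/asakusa.jpg
a man is sitting on a bench
a group of people are standing in the snow
# image/tree.jpg
a man is sitting on a bench
a group of people sit on a sidewalk
"""
    result = parse_captions(sample)
    shared = set(result["image/asakusa.jpg"]) & set(result["image/tree.jpg"])
    print(sorted(shared))
```

Feeding it the captured stdout of generate_caption.py gives a dict from image path to caption list, which makes overlaps between images easy to count.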