2018-08-23

RL Study Notes: Model-Free Control

ReinforcementLearning

www.youtube.com

Slides, Notes

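Lecture 5 of David Silver's RL course covers Model-Free Control.
We have finally reached methods for finding the optimal policy; it took a while to get here.

The lecture introduced the methods below. We have finally made it to Q-Learning.
When I only skimmed this material before, I didn't understand the differences or the motivation, but now I understand how the methods differ.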

On-policy Monte Carlo
On-policy TD Learning
- Sarsa
Off-policy
- Q-Learning
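
To make the on-policy/off-policy distinction concrete, here is a minimal sketch of the two tabular update rules. The table layout (a dict keyed by `(state, action)`), the function names, and the default `alpha` and `gamma` are my own assumptions for illustration, not code from the lecture.

```python
from collections import defaultdict


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy TD control: bootstrap from the action actually taken next."""
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Off-policy TD control: bootstrap from the greedy action in s_next."""
    target = r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])


# Hypothetical usage: unseen (state, action) pairs start at 0.0.
Q = defaultdict(float)
sarsa_update(Q, "s0", "right", 1.0, "s1", "left")
q_learning_update(Q, "s0", "left", 1.0, "s1", ["left", "right"])
```

The only difference is the target: Sarsa uses the value of the next action chosen by the behaviour policy, while Q-Learning uses the maximum over next actions regardless of what is actually taken.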

It's about time I got hands-on and tried out the algorithms I've studied.

2018-08-21

RL Study Notes: Model-Free Prediction

ReinforcementLearning

www.youtube.com

Slides, Notes

Today's topic is Model-Free Prediction; Control is apparently covered in the next lecture. The contents were as follows.

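Monte-Carlo (MC) Learning
- Updates the value function after seeing the episode through to the end
Temporal-Difference (TD) Learning
- TD(0) updates the value function after looking only one step ahead
MC vs TD
- A trade-off between variance and bias; in MDPs, TD is generally more effective
TD(λ)
- n-step TD: n = 1 is TD(0), n → ∞ is MC
- averaging
  - Averaging n-step returns makes the estimate more robust
- λ-return
  - geometrically weighted average of the n-step returns
  - Tune λ and α to find the sweet spot between MC and TD(0)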
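
The relationship between n-step returns and the λ-return can be sketched as follows. This is my own minimal illustration for a single finished episode; the function names and the list-based episode layout (`rewards[k]` is the reward received after step `k`, `values[k]` is the current estimate for the state at step `k`) are assumptions, not something from the slides.

```python
def n_step_return(rewards, values, t, n, gamma):
    """n-step return from time t for one finished episode.

    The terminal state has value 0, so once the episode ends
    there is nothing left to bootstrap from.
    """
    T = len(rewards)
    end = min(t + n, T)
    g = sum(gamma ** (k - t) * rewards[k] for k in range(t, end))
    if end < T:  # bootstrap only if the episode has not ended yet
        g += gamma ** (end - t) * values[end]
    return g


def lambda_return(rewards, values, t, gamma, lam):
    """Average of n-step returns with weights (1 - lam) * lam**(n - 1).

    All weight left over after the episode ends goes to the full MC return.
    """
    steps_left = len(rewards) - t
    g = 0.0
    for n in range(1, steps_left):
        weight = (1 - lam) * lam ** (n - 1)
        g += weight * n_step_return(rewards, values, t, n, gamma)
    g += lam ** (steps_left - 1) * n_step_return(rewards, values, t, steps_left, gamma)
    return g
```

With `lam=0` only the 1-step return survives, which is the TD(0) target; with `lam=1` only the full-episode return survives, which is MC.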

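This lecture carefully explained the difference between MC and TD with examples and illustrations.
The last part was rushed, so I want to revisit it when I implement these methods.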

2018-08-19

RL Study Notes: Planning by Dynamic Programming

ReinforcementLearning

Today I watched Lecture 3 of David Silver's UCL Course on RL.

www.youtube.com

[Slides],[Note]

The content was roughly as follows.

Policy evaluation
Policy iteration
Value iteration
Plus ideas for making the computation more efficient
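
As a rough illustration of how policy iteration and value iteration differ, here is a small sketch on a made-up deterministic MDP. The states, actions, rewards, and `GAMMA` are hypothetical, and the fixed number of sweeps stands in for a proper convergence check.

```python
GAMMA = 0.9
# Hypothetical deterministic MDP: MDP[state][action] = (next_state, reward).
MDP = {
    "A": {"stay": ("A", 0.0), "go": ("B", 0.0)},
    "B": {"stay": ("B", 0.0), "go": ("T", 1.0)},
    "T": {},  # terminal state: no actions
}


def q_value(V, s, a):
    s_next, r = MDP[s][a]
    return r + GAMMA * V[s_next]


def greedy_policy(V):
    return {s: max(acts, key=lambda a: q_value(V, s, a))
            for s, acts in MDP.items() if acts}


def policy_evaluation(policy, sweeps=100):
    """Repeatedly apply the Bellman expectation backup for a fixed policy."""
    V = {s: 0.0 for s in MDP}
    for _ in range(sweeps):
        for s, a in policy.items():
            V[s] = q_value(V, s, a)
    return V


def policy_iteration():
    """Alternate full evaluation and greedy improvement until the policy is stable."""
    policy = {s: "stay" for s, acts in MDP.items() if acts}
    while True:
        V = policy_evaluation(policy)
        improved = greedy_policy(V)
        if improved == policy:
            return policy, V
        policy = improved


def value_iteration(sweeps=100):
    """Apply the Bellman optimality backup directly, with no explicit policy."""
    V = {s: 0.0 for s in MDP}
    for _ in range(sweeps):
        for s, acts in MDP.items():
            if acts:
                V[s] = max(q_value(V, s, a) for a in acts)
    return greedy_policy(V), V
```

On this toy problem both functions end up with the same policy; the difference is that policy iteration keeps an explicit policy and evaluates it between improvements, while value iteration only ever updates values.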

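This material was covered in the Udacity Deep Learning Nanodegree, but only briefly, so it was good to hear the difference between policy iteration and value iteration explained again.
Having diagrams as well as equations also helped my understanding.
I no longer flinch when V, π, or ⟨S,A,P,R,γ⟩ show up. lol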

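I have wasted a lot of time on this area so far, so here are some things to improve:

Don't start with advanced topics; confirm the bare-minimum basics first
- I should have understood RL and its problems first instead of jumping straight into Deep RL
Focus and understand the material in a short period
- For lectures, take notes and concentrate until I fully understand
Don't insist on studying only in English; rely on Japanese materials too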

2018-08-19

RL Study Notes: Markov Decision Processes

ReinforcementLearning

Lecture 2 of the UCL Course on RL by David Silver is about the definition of Markov Decision Processes. I always used to mix these definitions up, so it was a good refresher.

www.youtube.com

Slides

Notes https://www.evernote.com/l/ADy-blY1XhZMyYgwME5IaI5y2dhQ4piSqDs

2018-08-18

RL Study Notes

ReinforcementLearning

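I have been meaning to study Reinforcement Learning / Deep RL for a while, but I stalled halfway, so I am starting again.

What I have studied so far:

Exercises in the Udacity Deep Learning Nanodegree on controlling OpenAI Gym's CartPole and a flying quadcopter
UC Berkeley CS294, Lectures 1-4.

This time I will use the course by DeepMind's David Silver.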

www.youtube.com Slides

My goals:

Be able to build simple applications
Get Deep RL knowledge into my head so that I can understand new papers
Tackle only the new parts of UC Berkeley CS294 later

2018-03-23

My experience with Udacity Deep Learning Nanodegree

This week, I finished the Udacity Deep Learning Nanodegree (DLND), which I started in November 2017. I will write about my experience with the course, and I hope it helps those who are considering taking it decide whether to enroll.

TL;DR

The Udacity Deep Learning Nanodegree (DLND) is good for those without a DL background who want to understand and experience DL in a reasonable period.
The projects in the course and the review system helped me learn to use DL techniques rather than just know about them.
It is worth paying the $599 tuition fee and spending 4 months (12 hrs/week) of your time.

Background

I am a software engineer with 7+ years of experience in new service development, so when I was considering taking this course, I wasn't too worried about the coding. But I didn't have professional expertise in ML/DL.

I was really interested in deep learning and, before taking the course, had personally built an object detection model with TensorFlow's API, the way I would try out a new library or programming language.
But it was difficult for me to improve the model by tuning hyperparameters or customizing the architecture without a fundamental understanding of the algorithms behind the high-level API.

I also wanted a comprehensive picture of what we can do with DL.
So taking an online course seemed like a good choice to me, and it was.

About Udacity Deep Learning Nanodegree

DLND is one of the paid courses at Udacity. It has real projects with strict deadlines, scheduled so that you finish all of them in 4 months. You have to satisfy the requirements to pass strict reviews, which are done manually.
Before enrolling, I thought the $599 tuition fee was a bit expensive, but it turned out to be worth paying.

Curriculum

I will write about what I learned in the curriculum. It starts with an introduction and the basics of neural networks, then moves on to advanced models.

Introduction

The first part covers demos of DL applications, development environments (Anaconda and Jupyter Notebook), and a math refresher relevant to DL.
You can preview this part for free.

Neural Networks

The 2nd part is about the basics of neural networks. You learn about and implement a simple neural network. You also learn how to use Keras and TensorFlow.

Project: Predicting Bike Sharing Data. In this project, you implement a simple neural network with NumPy. This was a good exercise for understanding how the forward pass and backpropagation work in a neural network.
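
To give a flavor of what "forward pass" and "backpropagation" mean, here is a tiny sketch for a single sigmoid neuron in plain Python. It is not the project's code (the project uses NumPy and a larger network); the function names, the squared-error loss, and the learning rate are my own assumptions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(w, b, x):
    """Forward pass: weighted sum of the inputs followed by a sigmoid."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_step(w, b, x, y, lr=0.5):
    """One gradient-descent step on the loss L = 0.5 * (y_hat - y) ** 2."""
    y_hat = forward(w, b, x)
    # Chain rule: dL/dz = (y_hat - y) * sigmoid'(z), with sigmoid'(z) = y_hat * (1 - y_hat)
    delta = (y_hat - y) * y_hat * (1.0 - y_hat)
    new_w = [wi - lr * delta * xi for wi, xi in zip(w, x)]
    new_b = b - lr * delta
    return new_w, new_b
```

Calling `train_step` repeatedly on the same example moves the prediction toward the target, which is the whole idea in miniature.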

Convolutional Neural Networks

Convolutional Neural Networks (CNNs) have outperformed traditional ML models in computer vision tasks. They extract features from images to capture high-level concepts. You learn different CNN architectures and transfer learning by implementing mini projects.

I enjoyed the skin cancer detector project, in which it wasn't easy to get good results, so I read related articles and advice posted in Slack to deepen my understanding.

Project: Dog Breed Classifier. In this project, you implement a CNN with Keras to classify dog breeds.

Recurrent Neural Networks

Recurrent Neural Networks (RNNs) act on sequences of inputs, such as sentences and voice audio, while keeping context. Long short-term memory networks (LSTMs) are a special kind of RNN capable of learning long-term dependencies.

You implement both an RNN and an LSTM. You also learn word2vec, which represents words as vectors that capture their conceptual meaning.

Project: Generate TV Scripts. In this project, you generate TV scripts from an existing script dataset using an RNN.

Generative Adversarial Networks

Generative Adversarial Networks (GANs) have a unique structure compared to CNNs and RNNs. A GAN has a generator network and a discriminator network. While the generator tries to generate fake images, the discriminator tries to tell whether an image is fake. Both networks are trained adversarially.

Project: Generate Faces. In this project, you generate human face images using GANs. This network was complicated to build and train, so I checked related articles to get good results. It was actually harder for me to finish than the projects above.

Deep Reinforcement Learning

Reinforcement Learning is a type of machine learning where an agent learns how to maximize its performance by interacting with the environment. Google's DeepMind is famous in this area. They built game-playing agents that are remarkably better at each game than humans.

You learn about reinforcement learning first and then learn how deep neural networks make it better. For me, this is a very big and complicated topic, and its goal is different from that of the other models in this course.

Project: Teach a Quadcopter How to Fly. This is the last project of the course. You implement a quadcopter agent that learns how to fly using deep deterministic policy gradient (DDPG). This was the hardest project for me to complete. I read related articles as well as the original papers. Finally, I was able to get the training to converge, and my quadcopter flew successfully in a simulation environment, as I expected.

Sum up

I couldn't write about everything I learned in the course because there was so much. After I finished the course, and even while I was working through it, I thought that it was well organized and that I would have spent much more time if I had learned the material on my own.
So I highly recommend this course to people who want to learn what DL is, what DL is good at, and how difficult it is to build models. The tuition fee was much cheaper than the time it saved.

Even though I'm still not an expert, I have a better understanding of DL. That's good for me as a software engineer because I will be able to choose DL as a solution for some problems.

Thank you for reading this article. I hope you enjoyed it. :)

2018-03-12

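Semantic Job Search with Word2vec

Overview

In the previous post, I trained Word2vec on the words in job postings and confirmed that it can retrieve words similar to job titles and skills.

This time, I present a demo that uses those similar words to do semantic search with Elasticsearch.

On job sites, it is hard to find postings that match the job title or skill name in a search query, probably because there are many similar-but-different job titles and skill terms, as well as many spelling variations.

Semantic search

This demo uses Elasticsearch's Synonym Token Filter to add similar words to the search query, aiming to find more related search results.

Creating the synonym file

From the trained word2vec model, I take the 5 most similar words for each word and create a synonym file.

Example synonym file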

# synonym.txt
...
データサイエンティスト=>データ サイエンティスト,データ サイエンス,データ マイニング,データ アナリスト,機械 学習
機械学習=>機械 学習,自然 言語 処理,データ マイニング,マイニング,統計 学
...
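
Lines in this format could be produced with something like the sketch below. The `most_similar` callable is a hypothetical stand-in for whatever the trained model exposes (assumed here to return `(word, score)` pairs), and the scores in the fake lookup table are placeholders, not real model output.

```python
def build_synonym_lines(words, most_similar, topn=5):
    """Return one 'word=>syn1,syn2,...' line per word that has similar words."""
    lines = []
    for word in words:
        similar = [w for w, _score in most_similar(word, topn)]
        if similar:
            lines.append(f"{word}=>{','.join(similar)}")
    return lines


def fake_most_similar(word, topn):
    """Hypothetical stand-in for a trained model; the scores are placeholders."""
    table = {
        "機械学習": [("機械 学習", 0.9), ("自然 言語 処理", 0.8), ("データ マイニング", 0.7)],
    }
    return table.get(word, [])[:topn]


for line in build_synonym_lines(["機械学習"], fake_most_similar, topn=2):
    print(line)
```

Writing the returned lines to `synonym.txt` is then a plain file write; words without any similar entries are skipped so the file does not end up with empty rules.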

Elasticsearch mapping

For the demo, I prepare two indices: one that applies the Synonym Token Filter to search queries and a regular Japanese search index.

Regular search index

# jp_mapping.json
{
  "mappings": {
    "_doc": {
      "dynamic_templates": [{
        "texts": {
          "match_mapping_type": "string",
          "mapping": {
            "type": "text",
            "analyzer": "kuromoji",
            "search_analyzer": "kuromoji"
          }
        }
      }]
    }
  }
}

curl -XPUT -H "Content-Type: application/json" \
   localhost:9200/job_postings -d @jp_mapping.json

Semantic search index

# jp_semantic_mapping.json (synonyms_path points to the synonym file prepared above)
{
  "settings": {
    "index" : {
      "analysis" : {
        "analyzer" : {
          "synonym" : {
            "tokenizer": "whitespace",
            "filter" : ["synonym"]
          }
        },
        "filter" : {
          "synonym" : {
            "type" : "synonym",
            "synonyms_path" : "analysis/synonym.txt"
          }
        }
      }
    }
  },
  "mappings": {
    "_doc": {
      "dynamic_templates": [{
        "texts": {
          "match_mapping_type": "string",
          "mapping": {
            "type": "text",
            "analyzer": "kuromoji",
            "search_analyzer": "synonym"
          }
        }
      }]
    }
  }
}

curl -XPUT -H "Content-Type: application/json" \
   localhost:9200/job_postings_semantic -d @jp_semantic_mapping.json

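Inserting the data

I insert the same set of job postings into both indices. This is the data used to train word2vec in the previous post.

Results

Now let's look at the results in Kibana.

Regular search	Semantic search

The regular search results show only job postings that match the query word itself. The semantic search results, on the other hand, also include postings that match similar words, with about 6 times as many hits.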

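Discussion and summary

This time, for simplicity, I used the Synonym Token Filter and was able to improve on regular job posting search. This approach should also be useful when recruiters search for candidates.

However, to build this into a real product, it will probably be necessary to design a custom search score that takes word similarity into account and to incorporate it into the search logic.

Thank you for reading to the end. I hope this is useful as an example application of word2vec.

References

DiceTech ConceptualSearch: Dice, an overseas job site, presents more practical techniques that reflect similarity in the search score.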

jwata blog

Study notes and thoughts
