author: Kelly Rauchenberger <fefferburbia@gmail.com> 2016-02-01 09:30:04 -0500
committer: Kelly Rauchenberger <fefferburbia@gmail.com> 2016-02-01 09:30:04 -0500
commit: 617155fe562652c859a380d85cc5710783d79448
tree: f5eee89b0fa4b3c9dfe7187ca78916a71b59045e /kgramstats.h
parent: b316e309559d7176af6cf0bb7dcd6dbaa83c01cd
Added emoji freevar
Strings of emoji are tokenized separately from everything else and added to an emoticon freevar, where they are mixed in with regular emoticons like :P. This breaks old-style freevars like $name$ and $noun$, so some legacy support is left in for compatibility; eventually $name$ should be made into an actual new freevar. Emoji data is from gemoji (https://github.com/github/gemoji).
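To illustrate the idea of tokenizing emoji strings separately, here is a minimal C++ sketch. It is not the code from this commit: the `Token` struct, `decodeUtf8`, `isEmojiCodePoint`, and `tokenize` are hypothetical names, and the hard-coded code point ranges are a rough stand-in for the gemoji data the commit actually uses. It assumes valid UTF-8 input.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token type: either a run of emoji or ordinary text.
struct Token {
  std::string text;
  bool isEmoji;
};

// Decode one UTF-8 code point starting at s[i]; len receives its byte length.
// Malformed or truncated sequences are treated as single-byte characters.
char32_t decodeUtf8(const std::string& s, std::size_t i, std::size_t& len) {
  unsigned char c = static_cast<unsigned char>(s[i]);
  len = 1;
  if (c < 0x80) return c;
  std::size_t extra;
  char32_t cp;
  if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; }
  else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; }
  else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; }
  else return c;
  if (i + extra >= s.size()) return c;
  for (std::size_t k = 1; k <= extra; ++k) {
    cp = (cp << 6) | (static_cast<unsigned char>(s[i + k]) & 0x3F);
  }
  len = extra + 1;
  return cp;
}

// Assumption: a coarse range check instead of a real emoji table.
bool isEmojiCodePoint(char32_t cp) {
  return (cp >= 0x1F300 && cp <= 0x1FAFF) || (cp >= 0x2600 && cp <= 0x27BF);
}

// Split on ASCII whitespace, and also wherever the text switches between
// emoji and non-emoji, so each emoji run becomes its own token.
std::vector<Token> tokenize(const std::string& input) {
  std::vector<Token> tokens;
  std::string current;
  bool currentIsEmoji = false;
  auto flush = [&]() {
    if (!current.empty()) {
      tokens.push_back({current, currentIsEmoji});
      current.clear();
    }
  };
  std::size_t i = 0;
  while (i < input.size()) {
    std::size_t len = 1;
    char32_t cp = decodeUtf8(input, i, len);
    if (cp == U' ' || cp == U'\n' || cp == U'\t') {
      flush();
    } else {
      bool emoji = isEmojiCodePoint(cp);
      if (!current.empty() && emoji != currentIsEmoji) flush();
      currentIsEmoji = emoji;
      current.append(input, i, len);
    }
    i += len;
  }
  flush();
  return tokens;
}
```

With this sketch, an emoji run glued to a word is split off as a separate token flagged `isEmoji`, while a text emoticon such as :P stays an ordinary token. Mapping both kinds to one emoticon freevar would be a later step.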
Diffstat (limited to 'kgramstats.h')
-rw-r--r-- kgramstats.h | 5
1 files changed, 4 insertions, 1 deletions
diff --git a/kgramstats.h b/kgramstats.h
index a97d7bf..4acde65 100644
--- a/kgramstats.h
+++ b/kgramstats.h
@@ -112,8 +112,11 @@ private:
 
   int maxK;
   std::map<kgram, std::map<int, token_data> > stats;
-  word hashtags {"#hashtag"};
+
+  // Words
   std::map<std::string, word> words;
+  word hashtags {"#hashtag"};
+  word emoticons {"👌"};
 };
 
 void printKgram(kgram k);
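
The diff adds an `emoticons` member next to `hashtags`, both of type `word`, whose definition is not shown here. As a rough illustration only, a freevar of this kind could be modeled as a canonical placeholder plus a tally of the surface forms seen for it. The `FreeVar` struct and its members below are hypothetical and are not the project's `word` class.

```cpp
#include <map>
#include <random>
#include <string>
#include <utility>

// Hypothetical stand-in for a freevar: a canonical placeholder token
// plus counts of the concrete forms observed in the corpus.
struct FreeVar {
  std::string canon;
  std::map<std::string, int> forms;
  int total = 0;

  explicit FreeVar(std::string c) : canon(std::move(c)) {}

  // Record one occurrence of a concrete form (e.g. ":P" or an emoji run).
  void addForm(const std::string& form) {
    ++forms[form];
    ++total;
  }

  // Pick a form with probability proportional to its count;
  // fall back to the canonical placeholder if nothing was recorded.
  template <class Rng>
  std::string sample(Rng& rng) const {
    if (total == 0) return canon;
    std::uniform_int_distribution<int> dist(0, total - 1);
    int r = dist(rng);
    for (const auto& kv : forms) {
      if (r < kv.second) return kv.first;
      r -= kv.second;
    }
    return canon;
  }
};
```

Under this assumption, both emoji runs and text emoticons would be fed to `addForm` on the same instance, which is one way to read "mixed in with regular emoticons" in the commit message.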