De-duped pronunciations in generated database hkutil

Identical pronunciations now share a single ID and are reused by multiple forms. This has a negligible effect on database size, but it is useful for writing queries that look for words with exactly the same pronunciation. This constitutes a minor database update, which we will call d1.2.
author: Star Rauchenberger <fefferburbia@gmail.com> 2022-11-30 17:58:44 -0500
committer: Star Rauchenberger <fefferburbia@gmail.com> 2022-11-30 17:58:44 -0500
commit: 6816abc1e89fd955524d7c772477d6483d12cbf9 (patch)
tree: b8707bdb5e180ae7be9d2ddf0ccfbeb539f36361 /generator/generator.cpp
parent: 38c17f093615a16a4b4ec6dc2b5d3edb5c1d3895 (diff)
download: verbly-hkutil.tar.gz verbly-hkutil.tar.bz2 verbly-hkutil.zip
1 files changed, 9 insertions, 3 deletions
diff --git a/generator/generator.cpp b/generator/generator.cpp
index 0d073be..ad665a2 100644
--- a/generator/generator.cpp
+++ b/generator/generator.cpp

@@ -573,9 +573,15 @@ namespace verbly {
          }
          std::string phonemes = phoneme_data[2];
-          pronunciations_.emplace_back(phonemes);
-          pronunciation& p = pronunciations_.back();
-          formByText_.at(canonical)->addPronunciation(p);
+          if (pronunciationByPhonemes_.count(phonemes)) {
+            pronunciation& p = *pronunciationByPhonemes_[phonemes];
+            formByText_.at(canonical)->addPronunciation(p);
+          } else {
+            pronunciations_.emplace_back(phonemes);
+            pronunciation& p = pronunciations_.back();
+            pronunciationByPhonemes_[phonemes] = &p;
+            formByText_.at(canonical)->addPronunciation(p);
+          }
        }
      }
    }
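
The lookup-or-create pattern in the hunk above can be sketched in isolation as follows. This is a minimal illustration, not the generator's actual code: `Pronunciation`, `Form`, `PronunciationPool`, and `example` are hypothetical stand-ins, and the phoneme string is only sample data. It also assumes the backing container is a `std::list`, since the diff does not show how `pronunciations_` is declared; storing `&p` is only safe if the container never relocates its elements, which would not hold for a `std::vector` that grows.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the generator's types (not verbly's real classes).
struct Pronunciation {
  explicit Pronunciation(std::string p) : phonemes(std::move(p)) {}
  std::string phonemes;
};

struct Form {
  std::string text;
  std::vector<const Pronunciation*> pronunciations;
};

class PronunciationPool {
public:
  // Returns the shared object for these phonemes, creating it on first use.
  const Pronunciation& intern(const std::string& phonemes) {
    auto it = byPhonemes_.find(phonemes);
    if (it != byPhonemes_.end()) {
      return *it->second;  // already seen: reuse the existing object
    }
    storage_.emplace_back(phonemes);
    const Pronunciation& p = storage_.back();
    byPhonemes_[phonemes] = &p;
    return p;
  }

  std::size_t size() const { return storage_.size(); }

private:
  // Assumption: a std::list, because it never relocates elements, so the
  // raw pointers kept in byPhonemes_ remain valid as the pool grows.
  std::list<Pronunciation> storage_;
  std::map<std::string, const Pronunciation*> byPhonemes_;
};

// Usage: two forms with identical phonemes end up pointing at one object.
inline void example() {
  PronunciationPool pool;
  Form a{"their", {}};
  Form b{"there", {}};
  a.pronunciations.push_back(&pool.intern("DH EH1 R"));
  b.pronunciations.push_back(&pool.intern("DH EH1 R"));
  assert(a.pronunciations[0] == b.pronunciations[0]);
  assert(pool.size() == 1);
}
```

Compared with the hunk, the sketch uses a single `find` instead of `count` followed by `operator[]`, which avoids a second map lookup on the hit path; the behavior is otherwise the same.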