verbly - Natural language generation library

	Commit message (Collapse)	Author	Age	Files	Lines
*	Updated hkutil	Star Rauchenberger	2023-02-17	1	-0/+0
\|
*	Added optional database query timeout	Star Rauchenberger	2023-02-16	3	-0/+7
\|
*	Added antogram and antophone querying	Star Rauchenberger	2023-02-15	6	-7/+19
\|
*	Fixed infinite recursion in token class	Star Rauchenberger	2023-02-12	1	-1/+1
\|
*	Fixed some issues that stricter compilers picked up	Star Rauchenberger	2023-02-12	3	-5/+4
\|
*	Fixed frequencies using mpark variant	Star Rauchenberger	2023-02-03	1	-2/+2
\|
*	Added word frequency information	Star Rauchenberger	2023-02-03	8	-16/+113
\|
*	Fixed merography/merophony missing the end of the original string	Star Rauchenberger	2023-02-03	1	-2/+2
\|
*	Converted verbly::generator::part to std::variant	Star Rauchenberger	2022-12-14	2	-320/+73
\| \| \| \|	fixes #5
*	Migrate from mpark::variant to std::variant	Star Rauchenberger	2022-12-14	16	-173/+170
\|
*	Fixed uncopyable word class	Star Rauchenberger	2022-12-14	2	-3/+8
\|
*	Generator now splits ImageNet list into per-notion files	Star Rauchenberger	2022-12-09	3	-12/+23
\|
*	Merography and merophony appear to have been backwards	Star Rauchenberger	2022-12-09	3	-5/+5
\|
*	Added a bunch of stuff for making LINGO puzzles	Star Rauchenberger	2022-12-08	15	-17/+373
\|
*	De-duped pronunciations in generated database hkutil	Star Rauchenberger	2022-11-30	2	-3/+10
\| \| \| \| \| \|	Identical pronunciations will now share an ID and be re-used by multiple forms. This has a negligible effect on database size, but it's useful for writing queries looking for words with the exact same pronunciations. This constitutes a minor database update, which we will call d1.2.
*	More hkutil refactoring	Kelly Rauchenberger	2018-09-27	23	-1920/+866
\| \| \| \| \| \| \| \|	All database access goes through hatkirby::database now. verbly::token, verbly::statement::condition, and verbly::part have been converted to use mpark::variant now. verbly::binding has been deleted, and replaced with a mpark::variant typedef in statement.h. This means that the only remaining tagged union class is verbly::generator::part. refs #5
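As a rough illustration of replacing a tagged union with a variant, the sketch below uses std::variant, which this log shows the library later adopting in place of mpark::variant. The `binding` alias, its alternatives (int and std::string), and the `describe` helper are assumptions made for illustration; the actual typedef in statement.h is not shown here and may differ.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical stand-in for a bound query parameter; the real alternatives
// in statement.h are not shown in this log.
using binding = std::variant<int, std::string>;

// Renders a binding for debugging. std::visit dispatches on the active
// alternative, replacing the manual type tag and switch of a tagged union.
std::string describe(const binding& b)
{
  return std::visit([](const auto& value) -> std::string {
    using T = std::decay_t<decltype(value)>;
    if constexpr (std::is_same_v<T, int>) {
      return "int:" + std::to_string(value);
    } else {
      return "string:" + value;
    }
  }, b);
}
```

The variant tracks which alternative is active and destroys it correctly, which is the bookkeeping a hand-written tagged union class has to do manually.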
*	Removed unnecessary ROWIDs from database schema	Kelly Rauchenberger	2018-09-26	4	-80/+92
\| \| \| \| \| \| \| \|	The generator also now sorts and uniq's the WordNet files for antonymy, classification, and pertainymy/mannernymy, because those files contained duplicate rows, and the join tables without ROWIDs now enforce a uniqueness constraint. This constitutes a minor database update -- the new database is compatible with d1.0, but is ~12MB smaller. refs #6
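The sort-and-uniq step can be sketched as follows. This is only an illustration of the idea, not the generator's code: the `sortedUnique` name is hypothetical, and rows are treated as plain strings.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Sorts rows and drops exact duplicates, like `sort | uniq` on a text file.
// std::unique only removes adjacent duplicates, so the sort must come first.
std::vector<std::string> sortedUnique(std::vector<std::string> rows)
{
  std::sort(rows.begin(), rows.end());
  rows.erase(std::unique(rows.begin(), rows.end()), rows.end());
  return rows;
}
```

With duplicates removed up front, inserting the rows into a join table that enforces uniqueness no longer fails on repeated pairs.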
*	Re-created VerbNet 3.2 patch	Kelly Rauchenberger	2018-09-26	2	-815/+1609
\| \| \| \|	The previous file was not formatted in a way that was easy for patch to use. Also, the file has now been renamed to indicate that it is for VerbNet 3.2.
*	Replaced some split/implode uses with hkutil	Kelly Rauchenberger	2018-08-10	6	-18/+16
\|
*	Converted verbly::filter to use a variant object	Kelly Rauchenberger	2018-04-01	4	-756/+433
\|
*	Converted asserts in generator to exceptions	Kelly Rauchenberger	2018-03-31	6	-18/+26
\|
*	Migrated generator to hkutil	Kelly Rauchenberger	2018-03-31	20	-262/+331
\|
*	Started migrating to hkutil (does not build)	Kelly Rauchenberger	2018-03-30	11	-659/+105
\|
*	Added mask filters and fixed the synonym query	Kelly Rauchenberger	2017-12-21	4	-11/+193
\| \| \| \|	refs #1
*	Created database versioning system d1.0	Kelly Rauchenberger	2017-11-08	6	-0/+129
\| \| \| \| \| \| \| \| \| \| \|	Also added an ANALYZE statement to the end of the datafile generation process. This generates information that allows sqlite to sometimes come up with a better query plan, and in many cases can significantly speed up queries. This constitutes a minor database update, but because this is the first version that uses the database versioning system, older versions are essentially incompatible. refs #2
*	Added quote token	Kelly Rauchenberger	2017-11-08	2	-7/+39
\| \| \| \|	This token wraps the inner token in two provided delimiters.
*	Renamed definiteArticle token to indefiniteArticle	Kelly Rauchenberger	2017-11-08	2	-14/+16
\|
*	Fixed bug with token title case transform	Kelly Rauchenberger	2017-10-28	1	-2/+30
\| \| \| \| \|	Word tokens and literal tokens that contained more than one word would only capitalize the first word; this has been fixed.
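A minimal sketch of the corrected behavior, assuming words are separated by single spaces. The `titleCase` function is a hypothetical stand-in, not the token class's actual transform.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Capitalizes the first letter of every space-separated word, rather than
// only the first letter of the whole string.
std::string titleCase(std::string text)
{
  bool startOfWord = true;
  for (char& ch : text) {
    if (ch == ' ') {
      startOfWord = true;
    } else if (startOfWord) {
      ch = static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));
      startOfWord = false;
    }
  }
  return text;
}
```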
*	Added length field to form table	Kelly Rauchenberger	2017-10-16	5	-4/+27
\| \| \| \|	This commit contains a database update.
*	Added more casing options to tokens	Kelly Rauchenberger	2017-02-24	2	-21/+80
\|
*	Added method to check if a token is empty	Kelly Rauchenberger	2017-02-19	1	-0/+5
\|
*	Added transform tokens	Kelly Rauchenberger	2017-02-16	2	-5/+244
\|
*	Added word::synonyms join field (BAD)	Kelly Rauchenberger	2017-02-16	2	-0/+32
\| \| \| \| \| \|	Note that this is not a great implementation; the filter generated is mergeable with unrelated filters and may cause results that are misleading.
*	Fixed weird filter normalization crash	Kelly Rauchenberger	2017-02-16	1	-13/+13
\|
*	Tweaked database indexes	Kelly Rauchenberger	2017-02-13	1	-3/+3
\| \| \| \| \| \| \| \|	`rhymes_with` now also contains `prerhyme` so that rhyming joins can be covering. `notions_lemmas` and `lemmas_notions` have been created so as to facilitate "jumping" over `words` when it's only needed as a many-to-many through table. Because `notion_words` and `lemma_words` are prefixes of these new indexes, they have been removed.
*	Expanded some indexes	Kelly Rauchenberger	2017-02-11	1	-34/+34
\| \| \| \|	These modifications can make some queries run significantly faster.
*	Fixed statement generation involving two CTEs for the same table	Kelly Rauchenberger	2017-02-11	1	-2/+10
\|
*	Added negative filter conversions to objects	Kelly Rauchenberger	2017-02-10	5	-40/+103
\|
*	Renamed object validity checks	Kelly Rauchenberger	2017-02-10	7	-7/+7
\| \| \| \| \|	The bool conversion operator was unfortunately Very Confusing so I've just renamed the methods to isValid.
*	Made pronunciation::rhymes join dynamic	Kelly Rauchenberger	2017-02-06	24	-365/+574
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This involved adding a new type of filter; one that compares (currently only equality and inequality) a field with another field located in an enclosing join context. In the process, it was discovered that simplifying the lemma::forms join field earlier actually made some queries return inaccurate results because the inflection of the form was being ignored and anything in the lemma would be used because of the inner join. Because the existing condition join did not allow for the condition field to be on the from side of the join, two things were done: a condition version of joinThrough was made, and lemma was finally eliminated as a top-level object, replaced instead with a condition join between word and form through lemmas_forms. Queries are also now grouped by the first select field (assumed to be the primary ID) of the top table, in order to eliminate duplicates created by inner joins, so that there is a uniform distribution between results for random queries. Created a database index on pronunciations(rhyme) which decreases query time for rhyming filters. The new database version is backwards-compatible because no data or structure changed.
*	Fixed error with notion::words join	Kelly Rauchenberger	2017-02-05	1	-1/+1
\|
*	Added some missing includes	Kelly Rauchenberger	2017-02-05	2	-0/+2
\|
*	Flattened selrestrs	Kelly Rauchenberger	2017-02-05	20	-12732/+468
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of logically being a tree of positive/negative restrictions that are ANDed/ORed together, selrestrs are now a flat set of positive restrictions that are ORed together. They are stored as strings in a table called selrestrs, just like synrestrs, which makes them a lot more queryable now as well. This change required some changes to the VerbNet data, because we needed to consolidate any ANDed clauses into single selrestrs, as well as convert any negative selrestrs into positive ones. The changes made are detailed on the wiki. Preposition choices are now encoded as comma-separated lists instead of using JSON. This change, along with the selrestrs one, allows us to remove verbly's dependency on nlohmann::json.
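Reading a comma-separated list needs nothing beyond the standard library, which is what makes dropping the JSON dependency possible. The sketch below is an assumption-laden illustration: `splitChoices` is a hypothetical name, and the example value "in,on,at" is invented rather than taken from the database.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Splits a comma-separated list such as "in,on,at" into its choices.
// An empty input yields an empty vector.
std::vector<std::string> splitChoices(const std::string& encoded)
{
  std::vector<std::string> choices;
  std::istringstream stream(encoded);
  std::string choice;
  while (std::getline(stream, choice, ',')) {
    choices.push_back(choice);
  }
  return choices;
}
```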
*	Renamed object join fields to prevent conflicts with class names	Kelly Rauchenberger	2017-02-03	15	-52/+52
\| \| \| \|	This was not a problem with clang but it caused compilation errors with gcc.
*	Fixed reference to local address bug with gcc	Kelly Rauchenberger	2017-02-03	1	-3/+6
\| \| \| \|	Using the ternary operator appeared to cause a reference to local address bug with field::getConditionField. Replacing the ternary operator with an if statement fixes the problem. This problem did not occur with clang.
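The code in field::getConditionField is not shown in this log, so the following is only one common way a ternary can produce a reference to a local or temporary, not necessarily what happened here. When the two operands have different types, the conditional expression yields a temporary, and returning it by reference dangles. The `pick` function and its names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Problematic shape (left commented out): the operands are a string literal
// and a std::string, so the ternary yields a temporary std::string, and the
// function would return a reference to that temporary.
//
//   const std::string& pick(bool useOverride, const std::string& name)
//   {
//     return useOverride ? "override" : name;
//   }

// Safer shape: each branch returns an object that outlives the call.
const std::string& pick(bool useOverride, const std::string& name)
{
  static const std::string overrideName = "override";
  if (useOverride) {
    return overrideName;
  }
  return name;
}
```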
*	Fixed statement generation re negative joins without a through table	Kelly Rauchenberger	2017-02-03	1	-1/+1
\| \| \| \| \| \| \| \|	Previously, a simple negative join directly to an object rather than through another table would error because the statement generator would attempt to instantiate a CTE on the field's through table, which is undefined. Now, the proper table is used regardless of whether a through table is defined.
*	Added enum inequality matches to field	Kelly Rauchenberger	2017-02-03	3	-1/+25
\|
*	Restructured verb frame schema to be more queryable	Kelly Rauchenberger	2017-01-28	35	-649/+885
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Groups are much less significant now, and they no longer have a database table, nor are they considered a top level object anymore. Instead of containing their own role data, that data is folded into the frames so that it's easier to query; as a result, each group has its own copy of the frames that it contains. Additionally, parts are considered top level objects now, and you can query for frames based on attributes of their indexed parts. Synrestrs are also contained in their own table now, so that parts can be filtered against their synrestrs; they are however not considered top level objects. Created a new type of field, the "join where" or "condition join" field, which is a normal join field that has a built in condition on a specified field. This is used to allow creating multiple distinct join fields from one object to another. This is required for the lemma::form and frame::part joins, because filters for forms of separate inflections should not be coalesced; similarly, filters on differently indexed frame parts should not be coalesced. Queries can now be ordered, ascending or descending, by a field, in addition to randomly as before. This is necessary for accessing the parts of a verb frame in the correct order, but may be useful to an end user as well. Fixed a bug with statement generation in that condition groups were not being surrounded in parentheses, which made mixing OR groups and AND groups generate inaccurate statements. This has been fixed; additionally, parentheses are not placed around the top level condition, and nested condition groups with the same logic type are coalesced, to make query strings as easy to read as possible. Also simplified the form::lemma field; it no longer conditions on the inflection of the form like the lemma::form field does. 
Also added a debug flag to statement::getQueryString that makes it return a query string with all of the bindings filled in, for debug use only.
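The parenthesization rules described above (parentheses around nested groups, none around the top level, and coalescing of nested groups with the same logic type) can be sketched roughly as follows. The `condition` struct, `makeLeaf`, `makeGroup`, `flatten`, and `render` are all hypothetical names for illustration and do not reflect verbly's actual statement classes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical condition tree: a leaf holds raw SQL text, a group joins
// its children with AND (all) or OR (any).
struct condition {
  enum class kind { leaf, all, any };
  kind type = kind::leaf;
  std::string text;                 // used when type == kind::leaf
  std::vector<condition> children;  // used for groups
};

condition makeLeaf(std::string text)
{
  condition c;
  c.text = std::move(text);
  return c;
}

condition makeGroup(condition::kind type, std::vector<condition> children)
{
  condition c;
  c.type = type;
  c.children = std::move(children);
  return c;
}

// Appends the group's children to out, splicing in nested groups that
// share the parent's logic type so they need no parentheses of their own.
void flatten(const condition& parent, std::vector<condition>& out)
{
  for (const condition& child : parent.children) {
    if (child.type == parent.type) {
      flatten(child, out);
    } else {
      out.push_back(child);
    }
  }
}

// Renders a condition; nested groups get parentheses, the top level does not.
std::string render(const condition& cond, bool topLevel = true)
{
  if (cond.type == condition::kind::leaf) {
    return cond.text;
  }
  std::vector<condition> parts;
  flatten(cond, parts);
  const std::string joiner =
    (cond.type == condition::kind::all) ? " AND " : " OR ";
  std::string result;
  for (std::size_t i = 0; i < parts.size(); i++) {
    if (i > 0) {
      result += joiner;
    }
    result += render(parts[i], false);
  }
  return topLevel ? result : "(" + result + ")";
}
```

Without the parentheses, an OR group nested inside an AND group would be re-associated by SQL operator precedence, which is the kind of inaccurate statement the fix addresses.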
*	Removed some debug output	Kelly Rauchenberger	2017-01-24	3	-4/+0
\|
*	Whitespace changes	Kelly Rauchenberger	2017-01-24	32	-801/+801
\|