verbly - Natural language generation library

	Commit message	Author	Age	Files	Lines
...
*	Added form::startsWithVowelSound convenience method	Kelly Rauchenberger	2017-01-23	2	-51/+77
\|
*	Rewrote tokens	Kelly Rauchenberger	2017-01-23	4	-650/+405
\|
*	Added verb frame parsing	Kelly Rauchenberger	2017-01-23	9	-260/+643
\|
*	Added filter compacting	Kelly Rauchenberger	2017-01-23	3	-42/+76
\|	Before statement compilation, empty filters are removed from group filters, and childless group filters become empty filters.
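The compacting rule described in this commit message can be sketched as follows. This is a minimal illustration in Python, not verbly's actual C++ code; the `Empty`, `Condition`, and `Group` classes and the `compact` function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Empty:
    """Hypothetical stand-in for an empty filter."""


@dataclass
class Condition:
    """Hypothetical stand-in for a leaf filter holding one condition."""
    text: str


@dataclass
class Group:
    """Hypothetical stand-in for a group filter (AND by default, OR if or_logic)."""
    children: list = field(default_factory=list)
    or_logic: bool = False


def compact(f):
    """Drop empty children from groups; a group left with no children becomes Empty."""
    if not isinstance(f, Group):
        return f
    # Compact children first so that nested childless groups collapse too.
    kept = [c for c in (compact(child) for child in f.children)
            if not isinstance(c, Empty)]
    if not kept:
        return Empty()
    return Group(kept, f.or_logic)
```

Compacting bottom-up means a group whose only children are themselves childless groups collapses all the way to a single empty filter.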
*	Fixed normalization of negative join filters	Kelly Rauchenberger	2017-01-23	1	-183/+192
\|	Previously, negative join filters were folded in with positive joins by AND/ORing them together and negating the negative joins. Checking for the existence of something that doesn't match a condition is different from checking for the non-existence of something that does match a condition, so now normalization considers positive and negative join filters to be distinct classes of filters and does not fold them together. Also made some whitespace changes.
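The distinction drawn in this commit message can be seen in a small SQLite example. This is only a sketch: the `word` and `tag` tables and the `demo` function are hypothetical and are not part of verbly's schema or API.

```python
import sqlite3


def demo():
    """Return (words having some tag other than 'x', words having no 'x' tag)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE word (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE tag (word_id INTEGER, label TEXT);
        INSERT INTO word VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
        -- A has tags x and y, B has only y, C has only x, D has none.
        INSERT INTO tag VALUES (1, 'x'), (1, 'y'), (2, 'y'), (3, 'x');
    """)
    # Existence of a joined row that does NOT match the condition.
    some_non_matching = [row[0] for row in db.execute("""
        SELECT w.name FROM word w
        WHERE EXISTS (SELECT 1 FROM tag t
                      WHERE t.word_id = w.id AND t.label <> 'x')
        ORDER BY w.name
    """)]
    # Non-existence of a joined row that DOES match the condition.
    none_matching = [row[0] for row in db.execute("""
        SELECT w.name FROM word w
        WHERE NOT EXISTS (SELECT 1 FROM tag t
                          WHERE t.word_id = w.id AND t.label = 'x')
        ORDER BY w.name
    """)]
    db.close()
    return some_non_matching, none_matching
```

With this data the first query selects A and B, while the second selects B and D, so negating the condition inside a positive join is not a substitute for a negative join.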
*	Fixed nullity/non-nullity filters on join fields	Kelly Rauchenberger	2017-01-23	1	-2/+12
\|
*	Whitespace changes	Kelly Rauchenberger	2017-01-23	1	-19/+19
\|
*	Fixed generator ignoring multiple inflection variants	Kelly Rauchenberger	2017-01-22	1	-257/+268
\|	Previously, the generator would recognize at most one form per inflection per lemma; now, the generator adds all variants in AGID to the database.
*	Removed underscores in two-word literal prepositions in verb frames	Kelly Rauchenberger	2017-01-22	1	-2/+11
\|
*	Fixed statement generation involving negative subqueries	Kelly Rauchenberger	2017-01-21	2	-57/+234
\|	Previously, we generated negative subqueries by integrating them into the main statement normally, and then making the connecting join be a LEFT JOIN instead of an INNER JOIN, and by adding a condition that the join column be NULL. The problem with this is that if the top table of the subquery joins against any other table (which join throughs always do), then no rows will be returned. This was solved by putting the subquery into a CTE and then LEFT JOINing as before with the CTE.
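The failure mode and the CTE fix described in this commit message can be reproduced with a toy SQLite database. This is a sketch under assumptions: the `noun`, `link`, and `other` tables, the CTE name `bad`, and the `run` function are hypothetical and do not reflect the statements verbly actually generates.

```python
import sqlite3

SCHEMA = """
CREATE TABLE noun (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE other (id INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE link (noun_id INTEGER, other_id INTEGER);  -- join-through table
INSERT INTO noun VALUES (1, 'cat'), (2, 'dog'), (3, 'eel');
INSERT INTO other VALUES (10, 'bad'), (11, 'good');
INSERT INTO link VALUES (1, 10), (2, 11);
"""

# Subquery folded into the main statement: the INNER JOIN after the LEFT JOIN
# discards exactly the NULL-extended rows that the IS NULL check is looking for.
NAIVE = """
SELECT n.name FROM noun n
LEFT JOIN link l ON l.noun_id = n.id
INNER JOIN other o ON o.id = l.other_id AND o.label = 'bad'
WHERE l.noun_id IS NULL
ORDER BY n.name
"""

# Subquery isolated in a CTE: its inner join is resolved first, and the outer
# statement only LEFT JOINs against the finished result.
WITH_CTE = """
WITH bad AS (
    SELECT l.noun_id FROM link l
    INNER JOIN other o ON o.id = l.other_id
    WHERE o.label = 'bad'
)
SELECT n.name FROM noun n
LEFT JOIN bad ON bad.noun_id = n.id
WHERE bad.noun_id IS NULL
ORDER BY n.name
"""


def run(query):
    """Execute one query against a fresh in-memory copy of the toy data."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    rows = [row[0] for row in db.execute(query)]
    db.close()
    return rows
```

Here `run(NAIVE)` returns no rows at all, while `run(WITH_CTE)` returns the nouns that are not linked to a 'bad' row. The WITH clause needs SQLite 3.8.3 or later.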
*	Fixed statement generation involving nullity/non-nullity	Kelly Rauchenberger	2017-01-21	1	-1/+20
\|
*	Moved some generator classes into the main namespace	Kelly Rauchenberger	2017-01-21	16	-431/+479
\|
*	Fixed instances of prep "off of" in VerbNet data	Kelly Rauchenberger	2017-01-21	1	-0/+13
\|
*	Started structural rewrite	Kelly Rauchenberger	2017-01-16	78	-8696/+8971
\|	The new object structure was designed to build on the existing WordNet structure, while also adding in all of the data that we get from other sources. More information about this can be found on the project wiki. The generator has already been completely rewritten to generate a datafile that uses the new structure. In addition, a number of indexes are created, which doubles the size of the datafile but also allows for much faster lookups. Finally, the new generator is written modularly and is a lot more readable than the old one. The verbly interface to the new object structure has mostly been completed, but has not been tested fully. There is a completely new search API which utilizes a lot of operator overloading; documentation on how to use it should go up at some point. Token processing and verb frames are currently unimplemented. The source for these has been left in the repository for now.
*	Updated to nlohmann/json 2.0.9	Kelly Rauchenberger	2016-12-28	1	-291/+1698
\|
*	Removed nlohmann/json submodule	Kelly Rauchenberger	2016-11-27	4	-4/+10795
\|	The submodule contained around 73MB of benchmarks and tests that are not necessary for inclusion in this project. Thus, the submodule has been removed, and the 2.0.7 release of nlohmann/json has been added to the repository.
*	Added pronunciation syllable count and stress structure	Kelly Rauchenberger	2016-05-30	11	-12/+330
\|	Also updated CMakeLists.txt such that including projects don't have to include sqlite3.
*	Added debug print method for token type	Kelly Rauchenberger	2016-05-16	2	-0/+17
\|
*	Fixed token extra functionality in copying	Kelly Rauchenberger	2016-05-16	1	-0/+2
\|
*	Added rhymes_with predicate based on rhymes rather than words	Kelly Rauchenberger	2016-05-16	8	-0/+32
\|
*	Implemented some accidentally unimplemented adjective_query predicates	Kelly Rauchenberger	2016-05-10	1	-0/+21
\|
*	Merge branch 'master' of https://github.com/hatkirby/verbly	Kelly Rauchenberger	2016-05-02	18	-366/+980
\|\
\| *	Fixed problem with words containing certain characters	Kelly Rauchenberger	2016-04-18	1	-1/+6
\| \|	The generator previously had a problem wherein it would ignore WordNet lemmas containing certain non-alpha characters (hyphens, slashes, numbers, apostrophes). In addition to these words not being included in the generated datafile, it had the side effect of causing relationships involving the ignored words (e.g. hypernymy, synonymy) to instead be related to the word with id 0, which did not exist. This rarely caused a failure with direct queries, but it caused hierarchical queries (most notably full hyponymy, which is where the error was noticed) to potentially permit far more lemmas than they should have because a very large number of words could be transitively reached through the sentinel word id 0. The generator has been fixed to not ignore the words containing special characters, which removed the word id 0 from most relationships and therefore fixed hierarchical queries. The only remaining word id 0s are as a synonym of "free-flying" (synset 301380571) and as an anti-mannernym of "aerially" (synset 400202718). This is because the WordNet data is malformed in the definitions of two words: "aerial" (synset 301380267) and "marine" (synset 301380721). The generator ignored those two lines, causing the described error, although the latter word being ignored did not cause any other errors. The bug was discovered when the Twitter bot difference (https://github.com/hatkirby/difference) generated a tweet (https://twitter.com/differencebot/status/722084219925700613) as a result of returning the noun "tearaway" in a full hyponym query of "artifact".
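How a shared sentinel id can widen a transitive query, as this commit message describes, can be sketched with a small graph traversal. The ids, the edge lists, and the `full_hyponyms` function below are made up for illustration and are not taken from the verbly datafile or its query interface.

```python
from collections import deque


def full_hyponyms(root, edges):
    """Return every id transitively reachable downward from root.

    edges is an iterable of (hyponym_id, hypernym_id) pairs.
    """
    children = {}
    for hypo, hyper in edges:
        children.setdefault(hyper, set()).add(hypo)
    seen = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical ids: 1 = "artifact", 2 = an ordinary hyponym of it,
# 3 and 5 = words that the buggy generator skipped, 4 = "tearaway".
BUGGY_EDGES = [(2, 1), (0, 1), (4, 0)]  # skipped words collapsed to id 0
FIXED_EDGES = [(2, 1), (3, 1), (4, 5)]  # skipped words keep their own ids
```

With the buggy edges, id 4 is reachable from id 1 only because two unrelated skipped words were both written as id 0. With the fixed edges, the traversal from id 1 stops at ids 2 and 3.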
\| *	Fixed perfect rhyming	Kelly Rauchenberger	2016-04-17	15	-75/+442
\| \|	Rhyme detection now ensures that any rhymes it finds are perfect rhymes and not identical rhymes. Rhyme detection is also now a lot faster because additional information is stored in the datafile. Also fixed a bug in the query interface (and the generator) that could cause incorrect queries to be executed.
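One common way to separate perfect rhymes from identical rhymes is sketched below. It assumes CMUDICT-style ARPABET phonemes in which a trailing "1" marks primary stress, takes the rhyme to run from the last primary-stressed vowel to the end of the word, and requires the phoneme just before the rhyme to differ. These definitions and the function names are assumptions made for illustration; the commit does not say how verbly stores or compares rhymes.

```python
def split_rhyme(phonemes):
    """Split a phoneme list into (phoneme before the rhyme, rhyme).

    The rhyme is assumed to start at the last primary-stressed vowel.
    Returns ("", ()) if the word has no primary-stressed vowel.
    """
    stressed = [i for i, p in enumerate(phonemes) if p.endswith("1")]
    if not stressed:
        return "", ()
    start = stressed[-1]
    prerhyme = phonemes[start - 1] if start > 0 else ""
    return prerhyme, tuple(phonemes[start:])


def is_perfect_rhyme(a, b):
    """True if the rhymes match and the sounds just before them differ."""
    pre_a, rhyme_a = split_rhyme(a)
    pre_b, rhyme_b = split_rhyme(b)
    if not rhyme_a or rhyme_a != rhyme_b:
        return False
    # Equal sounds before the rhyme would make this an identical rhyme.
    return pre_a != pre_b
```

Under these assumptions "cat" (K AE1 T) and "hat" (HH AE1 T) form a perfect rhyme, while "leave" (L IY1 V) and "believe" (B IH0 L IY1 V) share the preceding L and are rejected as an identical rhyme.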
\| *	Added support for ImageNet and fixed bug with query interface	Kelly Rauchenberger	2016-04-15	13	-309/+551
\| \|	Datafile change: nouns now know how many images are associated with them on ImageNet, and also have their WordNet synset ID saved so that you can query for images of that noun via the ImageNet API. So far, verbly only exposes the ImageNet API URL, and doesn't actually interact with it itself. This may be changed in the future. The query interface had a huge issue in which multiple instances of the same condition would overwrite each other. This has been fixed.
* \|	Added "requires plural form" noun query predicate	Kelly Rauchenberger	2016-05-02	2	-0/+16
\|/
*	Added sqlite3 version restriction (for WITH clauses)	Kelly Rauchenberger	2016-03-29	1	-2/+2
\|
*	Added prefix/suffix search, and word complexity search for nouns, adjectives, and adverbs	Kelly Rauchenberger	2016-03-27	10	-12/+452
\|	Word complexity refers to the number of words in a noun, adjective, or adverb.
*	Fixed bug with filters	Kelly Rauchenberger	2016-03-26	1	-2/+4
\|
*	Added full hierarchy search for meronymy	Kelly Rauchenberger	2016-03-26	2	-1/+313
\|
*	Added verb frames	Kelly Rauchenberger	2016-03-24	40	-2853/+6363
\|	In addition:
\|	- Added prepositions.
\|	- Rewrote a lot of the query interface. It now, for a lot of relationships, supports nested AND, OR, and NOT logic.
\|	- Rewrote the token class. It is now a union-like class instead of being polymorphic, which means smart pointers are no longer necessary.
\|	- Querying with regards to word derivation has been temporarily removed.
\|	- Sentinel values are now supported for all word types.
\|	- The VerbNet data retrieved from http://verbs.colorado.edu/~mpalmer/projects/verbnet/downloads.html was found to not be perfectly satisfactory in some regards, especially regarding adjective phrases. A patch file is now included in the repository describing the changes made to the VerbNet v3.2 download for the canonical verbly datafile.
*	Added fallback to vowel sound detection when no entry exists in CMUDICT	Kelly Rauchenberger	2016-03-20	1	-3/+11
\|
*	Fixed bug in generator regarding proper noun detection	Kelly Rauchenberger	2016-03-20	1	-2/+1
\|
*	Nouns with any uppercase letters are now considered proper	Kelly Rauchenberger	2016-03-19	4	-6/+29
\|
*	Added license information for AGID and CMUDICT	Kelly Rauchenberger	2016-03-19	1	-1/+72
\|
*	Added vowel sound identification	Kelly Rauchenberger	2016-03-19	2	-0/+9
\|
*	Renamed some class members that shared names with classes	Kelly Rauchenberger	2016-03-17	2	-5/+6
\|
*	Added word derivational relationships (kind of eh at the moment) and moved verbly into its own directory	Kelly Rauchenberger	2016-03-16	23	-0/+2348
\|
*	Added more inflections, word relationships, and pronunciations	Kelly Rauchenberger	2016-03-16	16	-346/+2740
\|	Nouns, adjectives, and adverbs now have inflected forms. A large number of WordNet word relationships (all noun-noun relationships, plus synonymy and antonymy for all word types except verbs) have been added. Additionally, CMUDICT is now being used to store word pronunciations for rhyming purposes. Verbly is now also a compiled library rather than being header-only due to the complexity of the query interface.
*	Started implementing verbly data generator	Kelly Rauchenberger	2016-03-10	5	-18/+106
\|	Currently, the generator:
\|	- Uses AGID to create entries for verb words and their inflections
\|	- Uses WordNet to create entries for adjective, adverb, and noun senses
*	Started verbly rewrite	Kelly Rauchenberger	2016-03-09	6	-0/+670
	verbly is intended to be a general use natural language generation library. Here, I'm using it to simply generate random verbs or adjectives. A schema for the sqlite database is provided, and for testing I manually added data. A generator program is being written that will generate a database from WordNet, VerbNet, PropBank, and AGID data.