path: root/lib/word.cpp
Commit message | Author | Age | Files | Lines
* Fixed uncopyable word class | Star Rauchenberger | 2022-12-14 | 1 | -2/+7
* More hkutil refactoring | Kelly Rauchenberger | 2018-09-27 | 1 | -74/+11
  All database access goes through hatkirby::database now. verbly::token, verbly::statement::condition, and verbly::part have been converted to use mpark::variant now. verbly::binding has been deleted, and replaced with a mpark::variant typedef in statement.h. This means that the only remaining tagged union class is verbly::generator::part.

  refs #5
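The move from a hand-written tagged union to a variant typedef can be illustrated with a minimal sketch. It uses `std::variant` (C++17) as a stand-in for `mpark::variant`, which offers the same interface for older standards. The `binding` alias below and the `describe` helper are hypothetical and are not taken from statement.h.

```cpp
#include <cstdint>
#include <string>
#include <variant>

// Hypothetical stand-in for a binding value: either an integer or a string.
// std::variant is used here in place of mpark::variant (an assumption made
// so that the sketch needs no third-party header).
using binding = std::variant<std::int64_t, std::string>;

// Render a binding as text. The visitor must handle every alternative,
// so a forgotten case is a compile error rather than a runtime bug.
std::string describe(const binding& b)
{
  struct visitor
  {
    std::string operator()(std::int64_t i) const { return std::to_string(i); }
    std::string operator()(const std::string& s) const { return "\"" + s + "\""; }
  };

  return std::visit(visitor{}, b);
}
```

Compared with a tagged union class, the variant supplies copy, move, and destruction for free, which is presumably much of why the hand-written classes could be deleted.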
* Added mask filters and fixed the synonym query | Kelly Rauchenberger | 2017-12-21 | 1 | -10/+14
  refs #1
* Added word::synonyms join field (BAD) | Kelly Rauchenberger | 2017-02-16 | 1 | -0/+20
  Note that this is not a great implementation; the filter generated is mergable with unrelated filters and may cause results that are misleading.
* Renamed object validity checks | Kelly Rauchenberger | 2017-02-10 | 1 | -1/+1
  The bool conversion operator was unfortunately Very Confusing so I've just renamed the methods to isValid.
* Made pronunciation::rhymes join dynamic | Kelly Rauchenberger | 2017-02-06 | 1 | -26/+39
  This involved adding a new type of filter; one that compares (currently only equality and inequality) a field with another field located in an enclosing join context.

  In the process, it was discovered that simplifying the lemma::forms join field earlier actually made some queries return inaccurate results because the inflection of the form was being ignored and anything in the lemma would be used because of the inner join. Because the existing condition join did not allow for the condition field to be on the from side of the join, two things were done: a condition version of joinThrough was made, and lemma was finally eliminated as a top-level object, replaced instead with a condition join between word and form through lemmas_forms.

  Queries are also now grouped by the first select field (assumed to be the primary ID) of the top table, in order to eliminate duplicates created by inner joins, so that there is a uniform distribution between results for random queries.

  Created a database index on pronunciations(rhyme) which decreases query time for rhyming filters. The new database version is backwards-compatible because no data or structure changed.
* Renamed object join fields to prevent conflicts with class names | Kelly Rauchenberger | 2017-02-03 | 1 | -3/+3
  This was not a problem with clang but it caused compilation errors with gcc.
* Restructured verb frame schema to be more queryable | Kelly Rauchenberger | 2017-01-28 | 1 | -5/+31
  Groups are much less significant now, and they no longer have a database table, nor are they considered a top level object anymore. Instead of containing their own role data, that data is folded into the frames so that it's easier to query; as a result, each group has its own copy of the frames that it contains.

  Additionally, parts are considered top level objects now, and you can query for frames based on attributes of their indexed parts. Synrestrs are also contained in their own table now, so that parts can be filtered against their synrestrs; they are however not considered top level objects.

  Created a new type of field, the "join where" or "condition join" field, which is a normal join field that has a built in condition on a specified field. This is used to allow creating multiple distinct join fields from one object to another. This is required for the lemma::form and frame::part joins, because filters for forms of separate inflections should not be coalesced; similarly, filters on differently indexed frame parts should not be coalesced.

  Queries can now be ordered, ascending or descending, by a field, in addition to randomly as before. This is necessary for accessing the parts of a verb frame in the correct order, but may be useful to an end user as well.

  Fixed a bug with statement generation in that condition groups were not being surrounded in parentheses, which made mixing OR groups and AND groups generate inaccurate statements. This has been fixed; additionally, parentheses are not placed around the top level condition, and nested condition groups with the same logic type are coalesced, to make query strings as easy to read as possible.

  Also simplified the form::lemma field; it no longer conditions on the inflection of the form like the lemma::form field does.

  Also added a debug flag to statement::getQueryString that makes it return a query string with all of the bindings filled in, for debug use only.
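The parenthesization fix described in this commit can be sketched roughly as follows. This is a simplified, hypothetical condition tree and not verbly's actual `statement::condition` class; the names `condition`, `kind`, `leaf`, `group`, and `to_sql` are invented for illustration. Nested groups are wrapped in parentheses, the top-level group is not, and a child group with the same logic type as its parent is folded into the parent.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified condition tree (not verbly's real classes).
struct condition
{
  enum class kind { leaf, all_of, any_of };

  kind type = kind::leaf;
  std::string text;                 // used when type == kind::leaf
  std::vector<condition> children;  // used when type is a group

  static condition leaf(std::string t)
  {
    condition c;
    c.text = std::move(t);
    return c;
  }

  static condition group(kind k, std::vector<condition> cs)
  {
    condition c;
    c.type = k;

    for (condition& child : cs)
    {
      if (child.type == k)
      {
        // Coalesce a nested group of the same logic type into its parent.
        for (condition& grandchild : child.children)
        {
          c.children.push_back(std::move(grandchild));
        }
      }
      else
      {
        c.children.push_back(std::move(child));
      }
    }

    return c;
  }

  // Nested groups get parentheses; the top-level condition does not.
  std::string to_sql(bool top_level = true) const
  {
    if (type == kind::leaf)
    {
      return text;
    }

    const char* sep = (type == kind::all_of) ? " AND " : " OR ";
    std::string out;

    for (std::size_t i = 0; i < children.size(); ++i)
    {
      if (i > 0)
      {
        out += sep;
      }

      out += children[i].to_sql(false);
    }

    return top_level ? out : "(" + out + ")";
  }
};
```

Without the parentheses, `a AND b OR c` would be parsed by SQL as `(a AND b) OR c`, which is the kind of inaccurate statement the commit message refers to.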
* Whitespace changes | Kelly Rauchenberger | 2017-01-24 | 1 | -28/+28
* Added word::getGroup | Kelly Rauchenberger | 2017-01-24 | 1 | -2/+22
* Started structural rewrite | Kelly Rauchenberger | 2017-01-16 | 1 | -34/+86
  The new object structure was designed to build on the existing WordNet structure, while also adding in all of the data that we get from other sources. More information about this can be found on the project wiki.

  The generator has already been completely rewritten to generate a datafile that uses the new structure. In addition, a number of indexes are created, which does double the size of the datafile, but also allows for much faster lookups. Finally, the new generator is written modularly and is a lot more readable than the old one.

  The verbly interface to the new object structure has mostly been completed, but has not been tested fully. There is a completely new search API which utilizes a lot of operator overloading; documentation on how to use it should go up at some point.

  Token processing and verb frames are currently unimplemented. Source for these have been left in the repository for now.
* Fixed perfect rhyming | Kelly Rauchenberger | 2016-04-17 | 1 | -19/+22
  Rhyme detection now ensures that any rhymes it finds are perfect rhymes and not identical rhymes. Rhyme detection is also now a lot faster because additional information is stored in the datafile. Also fixed a bug in the query interface (and the generator) that could cause incorrect queries to be executed.
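The distinction between perfect and identical rhymes can be sketched as below. This is an illustration of the general definition, not the code in this commit: it assumes ARPABET phonemes in the style of CMUDICT, where a vowel with primary stress ends in `1`, and the names `phonemes`, `last_stressed`, and `perfect_rhyme` are hypothetical. Two pronunciations rhyme perfectly when everything from the last stressed vowel onward matches and the sound immediately before it differs; if that preceding sound is also the same, the rhyme is identical.

```cpp
#include <string>
#include <vector>

using phonemes = std::vector<std::string>;

// Index of the last phoneme carrying primary stress (ARPABET vowels with
// primary stress end in '1'), or -1 if there is none.
int last_stressed(const phonemes& p)
{
  for (int i = static_cast<int>(p.size()) - 1; i >= 0; --i)
  {
    if (!p[i].empty() && p[i].back() == '1')
    {
      return i;
    }
  }

  return -1;
}

bool perfect_rhyme(const phonemes& a, const phonemes& b)
{
  const int ia = last_stressed(a);
  const int ib = last_stressed(b);

  if (ia < 0 || ib < 0)
  {
    return false;
  }

  // The rhyming parts (stressed vowel to the end) must match exactly.
  const phonemes ra(a.begin() + ia, a.end());
  const phonemes rb(b.begin() + ib, b.end());

  if (ra != rb)
  {
    return false;
  }

  // The sound just before the rhyming part must differ; otherwise this is
  // an identical rhyme rather than a perfect one.
  const std::string pa = (ia > 0) ? a[ia - 1] : std::string();
  const std::string pb = (ib > 0) ? b[ib - 1] : std::string();

  return pa != pb;
}
```

Precomputing the rhyming part and the preceding sound once per pronunciation turns the check into two field comparisons; whether that matches the "additional information" the commit stores in the datafile is an assumption.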
* Added verb frames | Kelly Rauchenberger | 2016-03-24 | 1 | -1/+10
  In addition:
  - Added prepositions.
  - Rewrote a lot of the query interface. It now, for a lot of relationships, supports nested AND, OR, and NOT logic.
  - Rewrote the token class. It is now a union-like class instead of being polymorphic, which means smart pointers are no longer necessary.
  - Querying with regards to word derivation has been temporarily removed.
  - Sentinel values are now supported for all word types.
  - The VerbNet data retrieved from http://verbs.colorado.edu/~mpalmer/projects/verbnet/downloads.html was found to not be perfectly satisfactory in some regards, especially regarding adjective phrases. A patch file is now included in the repository describing the changes made to the VerbNet v3.2 download for the canonical verbly datafile.
* Added fallback to vowel sound detection when no entry exists in CMUDICT | Kelly Rauchenberger | 2016-03-20 | 1 | -3/+11
* Added vowel sound identification | Kelly Rauchenberger | 2016-03-19 | 1 | -0/+8
* Added word derivational relationships (kind of eh at the moment) and moved verbly into its own directory | Kelly Rauchenberger | 2016-03-16 | 1 | -0/+32