Diffstat (limited to 'docs/new_object_structure.md')
-rw-r--r-- docs/new_object_structure.md 72
1 files changed, 72 insertions, 0 deletions
diff --git a/docs/new_object_structure.md b/docs/new_object_structure.md
new file mode 100644
index 0000000..9e25615
--- /dev/null
+++ b/docs/new_object_structure.md
@@ -0,0 +1,72 @@
# New object structure

The rewrite of verbly uses a redesigned object structure that builds on the existing WordNet structure and adds to it the data we get from other sources.

## notion
Something that can be expressed with words. Fields: part of speech, wnid (WordNet ID, optional). Nouns also have an images field (the number of images ImageNet has for this notion). Has many words. Notions are related to each other through hypernymy, meronymy, synonymy, etc. The parts of speech are:
- noun {0}
- adjective {1}
- adverb {2}
- verb {3}
- preposition {4}

The relations are:
- hypernymy (noun/noun and verb/verb)
- instantiation (noun/noun)
- meronymy (noun/noun)
- variation (noun/adjective)
- similarity (adjective/adjective) [symmetric]
- entailment (verb/verb)
- causality (verb/verb)

A notion also has a special "is a" relation between a preposition and a group name (a string).

## word
An expression of a concept. Belongs to a notion. Belongs to a lemma. Fields: tag count (optional). Adjectives also have a position field. Verbs optionally belong to groups. Words have several relations with other words:
- antonymy (noun/noun, adjective/adjective, adverb/adverb, verb/verb) [symmetric]
- specification (adjective/adjective, verb/verb)
- pertainymy (noun/adjective)
- mannernymy (adjective/adverb)
- usage (noun/noun, noun/adjective, noun/adverb, noun/verb)
- topicality (noun/noun, noun/adjective, noun/adverb, noun/verb)
- regionality (noun/noun, noun/adjective, noun/adverb, noun/verb)

The adjective positions are:
- predicate {0}
- attributive {1}
- postnominal {2}

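
One way to handle the [symmetric] marker on antonymy is to store both directions whenever such a relation is added. The sketch below assumes this approach; `WordRelations` and the integer word IDs are hypothetical, and the actual storage in verbly may differ.

```python
from enum import IntEnum


class AdjectivePosition(IntEnum):
    # Numeric codes are the {n} values listed in this section.
    PREDICATE = 0
    ATTRIBUTIVE = 1
    POSTNOMINAL = 2


# Word relations marked [symmetric] in this section.
SYMMETRIC = {"antonymy"}


class WordRelations:
    """In-memory store of directed (relation, from_word, to_word) edges."""

    def __init__(self):
        self._edges = set()

    def add(self, relation, a, b):
        self._edges.add((relation, a, b))
        if relation in SYMMETRIC:
            # Symmetric relations are stored in both directions.
            self._edges.add((relation, b, a))

    def related(self, relation, a):
        """Return the sorted IDs of words reachable from a via relation."""
        return sorted(b for (r, x, b) in self._edges if r == relation and x == a)
```

With this store, adding antonymy from word 1 to word 2 also makes word 1 an antonym of word 2, whereas a pertainymy edge stays one-directional.
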
## lemma
A lexical set that can be used to represent words. Has many inflections (including the base inflection). Has many words (that it represents). Relations with itself:
- derivation [not implemented yet]

In implementation, this object has no fields, and thus it does not need a table. Uniquely identifiable by base form. Constructible from base form.

## lemma/form
The inflection relationship relates an uninflected lemma to its inflected forms. There can be multiple ways to inflect a lemma, so the tuple (lemma_id, category) is not necessarily unique. Field: category (the type of inflection). Example: "care" is the singular (base) inflection of a noun and the base inflection of a verb; "cares" is both the plural and the s form inflection of "care". The types of inflection are:
- base {0}
- plural (nouns) {1}
- comparative (adjectives and adverbs) {2}
- superlative (adjectives and adverbs) {3}
- past tense (verbs) {4}
- past participle (verbs) {5}
- ing form (verbs) {6}
- s form (verbs) {7}

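
The "care"/"cares" example can be written out as rows of the inflection relationship. This is a minimal sketch: the row layout and the helper names are hypothetical, lemmas are keyed by base form for readability, and the "cactus" rows are an added illustration of a lemma with two forms in the same category.

```python
from enum import IntEnum


class InflectionType(IntEnum):
    # Numeric codes are the {n} values listed in this section.
    BASE = 0
    PLURAL = 1
    COMPARATIVE = 2
    SUPERLATIVE = 3
    PAST_TENSE = 4
    PAST_PARTICIPLE = 5
    ING_FORM = 6
    S_FORM = 7


# Rows of (lemma base form, inflection type, form text).
LEMMA_FORMS = [
    ("care", InflectionType.BASE, "care"),
    ("care", InflectionType.PLURAL, "cares"),
    ("care", InflectionType.S_FORM, "cares"),
    # (lemma, category) is not unique: two plural forms for one lemma.
    ("cactus", InflectionType.PLURAL, "cacti"),
    ("cactus", InflectionType.PLURAL, "cactuses"),
]


def inflection_types(lemma, form):
    """Return every inflection type that relates this lemma to this form."""
    return sorted(t for (l, t, f) in LEMMA_FORMS if l == lemma and f == form)


def forms_of(lemma, inflection_type):
    """Return every form of the lemma with the given inflection type."""
    return [f for (l, t, f) in LEMMA_FORMS if l == lemma and t == inflection_type]
```

Here `inflection_types("care", "cares")` yields both the plural and the s form types, and `forms_of("cactus", InflectionType.PLURAL)` returns two forms.
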
## form
An inflection of a lemma. Fields: text form, complexity (number of spaces plus one), proper (true if there is at least one capital letter, false otherwise). Uniquely identifiable by text form. Constructible from text form. Has many and belongs to many pronunciations.

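
Because a form is constructible from its text, the two derived fields can be computed directly from it. The following is a sketch of that derivation; the `Form` class is hypothetical and uses Python's `str.isupper` as the definition of "capital letter".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Form:
    """A form identified solely by its text."""

    text: str

    @property
    def complexity(self):
        # Number of spaces plus one.
        return self.text.count(" ") + 1

    @property
    def proper(self):
        # True if there is at least one capital letter.
        return any(ch.isupper() for ch in self.text)
```

For instance, `Form("New York")` has complexity 2 and is proper, while `Form("care")` has complexity 1 and is not.
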
## form/pronunciation
One spelling of a word can have multiple pronunciations (whether by homography or speaker variation), but multiple words can also have the same pronunciation (homophony). The data we currently have doesn't tell us which pronunciations go with which words, so we just associate all pronunciations of a form with the form.

## pronunciation
Fields: phonemes, rhyme phonemes, prerhyme, syllables, stress structure. Has many and belongs to many forms.

## frame
A verb frame. Belongs to a group. Has many parts.

## group (word/frame)
A collection of verb frames. Has many frames. Has many words. This is not really an object per se, but rather the name given to the cross join between a set of words and a set of frames. In implementation, this join has no fields, and thus it does not need a table.

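
The cross join can be illustrated with plain dictionaries: every word in a group is paired with every frame in the same group. The group ID, word names, and frame names below are placeholders, not real data.

```python
from itertools import product

# Hypothetical data: words and frames each carry a group ID.
GROUP_WORDS = {"group_a": ["word_1", "word_2"]}
GROUP_FRAMES = {"group_a": ["frame_1", "frame_2", "frame_3"]}


def word_frame_pairs(group_id):
    """Return the cross join of the group's words with the group's frames."""
    words = GROUP_WORDS.get(group_id, [])
    frames = GROUP_FRAMES.get(group_id, [])
    return list(product(words, frames))
```

Since the pairs are fully determined by the group ID shared between words and frames, nothing beyond that ID needs to be stored, which is consistent with the join not needing its own table.
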
## part
An ordered element of a verb frame. Belongs to a frame. Fields: index (position in the frame) and type. The tuple (frame_id, index) is unique. There are additional fields depending on the type of the part: noun phrases have role and selrestrs, prepositions have prepositions and preposition_literality, and literals have literal_value. In addition, noun phrases have synrestrs, which, in order to be queryable, are located in a separate table called "synrestrs".
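
A possible table layout for parts, with the uniqueness constraint and the separate "synrestrs" table, might look like the sketch below. Apart from the field names given above, everything here is an assumption: the column types, the `part_index` column name (chosen to avoid the SQL keyword `index`), the string type names, and the sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parts (
    part_id INTEGER PRIMARY KEY,
    frame_id INTEGER NOT NULL,
    part_index INTEGER NOT NULL,       -- position in the frame
    type TEXT NOT NULL,                -- hypothetical type names
    role TEXT,                         -- noun phrases only
    selrestrs TEXT,                    -- noun phrases only
    prepositions TEXT,                 -- prepositions only
    preposition_literality INTEGER,    -- prepositions only
    literal_value TEXT,                -- literals only
    UNIQUE (frame_id, part_index)
);
-- Kept in a separate table so that synrestrs can be queried directly.
CREATE TABLE synrestrs (
    part_id INTEGER NOT NULL REFERENCES parts(part_id),
    synrestr TEXT NOT NULL
);
""")


def add_part(frame_id, part_index, part_type, role=None, literal_value=None,
             synrestrs=()):
    """Insert a part; return its ID, or None if (frame_id, part_index) is taken."""
    try:
        cur = conn.execute(
            """INSERT INTO parts (frame_id, part_index, type, role, literal_value)
               VALUES (?, ?, ?, ?, ?)""",
            (frame_id, part_index, part_type, role, literal_value),
        )
    except sqlite3.IntegrityError:
        return None
    part_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO synrestrs (part_id, synrestr) VALUES (?, ?)",
        [(part_id, s) for s in synrestrs],
    )
    return part_id


def parts_with_synrestr(synrestr):
    """Return (frame_id, part_index) for every part with the given synrestr."""
    rows = conn.execute(
        """SELECT p.frame_id, p.part_index FROM parts p
           JOIN synrestrs s USING (part_id)
           WHERE s.synrestr = ? ORDER BY p.frame_id, p.part_index""",
        (synrestr,),
    )
    return [tuple(r) for r in rows]


# Sample values are placeholders.
np_id = add_part(1, 0, "noun_phrase", role="Agent",
                 synrestrs=["example_restriction"])
lit_id = add_part(1, 1, "literal", literal_value="up")
dup_id = add_part(1, 1, "literal", literal_value="down")  # position already used
```

The second insert at position 1 of frame 1 is rejected by the `UNIQUE` constraint, and `parts_with_synrestr` shows why a separate table helps: the lookup is a simple join rather than a search inside a packed column.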