# Verb frames

Verbly's verb frame data comes from VerbNet, a database compiled by the University of Colorado Boulder Department of Linguistics. More information, including a download for v3.2, the version used in the canonical verbly datafile, can be found on [Martha Palmer's website](http://verbs.colorado.edu/~mpalmer/projects/verbnet.html).

The downloadable data for VerbNet v3.2 has many quirks and inadequacies that make it unsuitable for natural language generation. In particular, it makes no distinction between noun phrases and adjective phrases, so in most cases it is impossible to decide how such a slot should be filled. To make the data more usable, I have gone through it and sanitized it in many places. A patch file, applicable to a clean VerbNet v3.2 download, [can be found in the repository](https://code.fourisland.com/verbly/tree/generator/vn-3.2.diff). This patch will likely continue to be updated as verbly is developed.

## Syntactic Restrictions
The data from VerbNet allows a set of syntactic restrictions to be combined with either AND logic or OR logic; however, OR logic is never used, so our implementation ignores it. Syntactic restrictions are also marked as additive or subtractive, but each distinct restriction either always appears positively or always appears negatively. We can therefore drop the additive/subtractive modifier from our implementation and fold the polarity into the restriction itself: a restriction that only ever appears negatively is defined to mean the negation of what its name "should" mean. These syntactic restrictions frequently indicate that the noun phrase is not actually a noun phrase and should be treated differently, so they are important to watch out for.
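
As a rough illustration of the simplification described above, the following Python sketch reads the syntactic restrictions from one NP element and discards their polarity. The XML fragment is a made-up example, and the element and attribute names (`SYNRESTRS`, `SYNRESTR`, `Value`, `type`, `logic`) are assumptions about the VerbNet v3.2 layout. The function name `syntactic_restrictions` is hypothetical and is not part of verbly.

```python
import xml.etree.ElementTree as ET

# Made-up fragment in the style of a VerbNet frame. The attribute names
# (Value, type, logic) are assumptions about the v3.2 XML layout.
SAMPLE = """
<NP value="Theme">
  <SYNRESTRS>
    <SYNRESTR Value="+" type="that_comp"/>
    <SYNRESTR Value="-" type="tensed_that"/>
  </SYNRESTRS>
</NP>
"""


def syntactic_restrictions(np_element):
    """Return the restriction names on an NP element, ignoring polarity."""
    container = np_element.find("SYNRESTRS")
    if container is None:
        return set()
    # OR logic never occurs in the data, so reject it loudly instead of
    # silently treating it as AND.
    if container.get("logic") == "or":
        raise ValueError("OR logic is not supported for syntactic restrictions")
    # The Value attribute ("+" or "-") is dropped on purpose: every
    # restriction name has one fixed polarity across the whole data set.
    return {r.get("type") for r in container.findall("SYNRESTR")}


restrictions = syntactic_restrictions(ET.fromstring(SAMPLE))
print(sorted(restrictions))
```

Rejecting OR logic with an error, rather than ignoring the attribute, is a design choice of this sketch: it would surface any future data release that starts using it.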

**np_ppart**  
As far as I can tell, this restriction has no purpose, although it always appears before an ADJP or ADJ.

**be_sc_ing, ac_ing, sc_ing, np_omit_ing**  
Used for gerund phrases.

**oc_ing**  
Used for gerund phrases. Always preceded by a noun or an objective pronoun; most of the time, the two are separated by a preposition, but not always.

**poss_ing, possing, pos_ing**  
Used for a possessive (whether it be a noun with an apostrophe s or a possessive pronoun) followed by a gerund phrase.

**acc_ing**  
Used for a noun (or an objective pronoun) followed by a participle phrase.

**genitive**  
Used to indicate that the noun phrase should be possessive.

**that_comp**  
Used for the word "that" followed by an independent clause in the simple past perfect tense.

**tensed_that**  
Always appears negatively and alongside a that_comp. Its purpose is unknown.

**wh_comp**  
Used for a phrase starting with the word "whether." The data using this restriction is a bit muddy, so this is not a perfect description. It will likely be cleaned up in a future release.

**what_extract**  
Used for a phrase starting with the word "what." The data using this restriction is a bit muddy, so this is not a perfect description. It will likely be cleaned up in a future release.

**how_extract**  
Used for a phrase starting with the word "how." The data using this restriction is a bit muddy, so this is not a perfect description. It will likely be cleaned up in a future release.

**sc_to_inf, ac_to_inf, vc_to_inf, rs_to_inf**  
Used for infinitive phrases.

**oc_to_inf**  
Used for infinitive phrases. Always immediately preceded by a noun or an objective pronoun.

**oc_bare_inf**  
Used for infinitive phrases with bare infinitives. Always immediately preceded by a noun or an objective pronoun.

**wh_inf**  
Used for the word "how" (or sometimes, "when" or "whether"), followed by an infinitive phrase. Which starting word is used is frame-dependent. The data using this restriction is a bit muddy, so this is not a perfect description. It will likely be cleaned up in a future release.

**what_inf**  
Used for the word "what" followed by an infinitive phrase. One frame in empathize-88.2 erroneously uses it to indicate a phrase of the form "what they want." This will likely be cleaned up in a future release.

**wheth_inf**  
Used for the word "whether" followed by an infinitive phrase.

**for_comp**  
Used to indicate the following format: the word "for", followed by a noun or an objective pronoun, followed by an infinitive phrase.

**quotation**  
Used to indicate a quotation.

**plural**  
Used to indicate that the noun phrase should be plural.

**definite**  
Always used negatively to indicate that a noun phrase should not be definite.

**adv_loc**  
Used to indicate either the word "here" or "there." In one case (throw-17.1), the word "away" is acceptable too. This will likely be cleaned up in a future release.

**refl**  
Used to indicate the usage of a reflexive pronoun.

**adjp**  
Used to indicate an adjective.

**sentential**  
Its purpose is unknown.

## Selectional Restrictions
Selectional restrictions are used to semantically filter nouns and prepositions. The namespaces for nouns and prepositions are separate.

Selectional restrictions for nouns are usually found in the role descriptions for each verb group; however, they can occasionally also be found in a specific NP element. Usually, subgroups inherit their roles from their parents, and NP elements inherit their selectional restrictions from the role they are assigned. When an NP element defines selectional restrictions even though its assigned role already has restrictions, or when a subgroup gives restrictions to a role that already has restrictions in the parent, the parent restrictions are ignored and the child's restrictions are used.
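
The override rule above can be sketched in Python as a most-specific-wins lookup. The data layout here is hypothetical and is not how verbly stores its data: each verb group is represented as a dictionary from role name to a set of restrictions, ordered from the root group down to the subgroup that owns the frame. The sketch also assumes that a role with no restrictions at one level falls back to the next level up, which the original data does not state explicitly.

```python
def resolve_restrictions(role, group_chain, element_restrictions=None):
    """Pick the selectional restrictions that apply to one NP element.

    group_chain lists role -> restrictions mappings, ordered from the root
    verb group down to the subgroup that owns the frame (hypothetical layout).
    """
    # Restrictions defined on the NP element itself replace everything else.
    if element_restrictions:
        return frozenset(element_restrictions)
    # Otherwise the most specific group that restricts the role wins;
    # restrictions from parent groups are ignored, not merged.
    for roles in reversed(group_chain):
        if roles.get(role):
            return frozenset(roles[role])
    return frozenset()


# Hypothetical parent group and subgroup, for illustration only.
parent = {"Agent": {"animate"}, "Theme": {"concrete"}}
child = {"Agent": {"organization"}}

print(sorted(resolve_restrictions("Agent", [parent, child])))
print(sorted(resolve_restrictions("Theme", [parent, child])))
print(sorted(resolve_restrictions("Theme", [parent, child], {"comestible"})))
```

The point of the sketch is that restrictions are replaced rather than merged: the subgroup's `Agent` restriction hides the parent's entirely.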

In the original data, restrictions for nouns and roles can be combined using AND logic or OR logic, and in a few rare cases two AND clauses are ORed together. Restrictions can also be either positive or negative. Our implementation flattens these selectional restriction trees to make them easier to query and parse: each set of selectional restrictions is stored as a set of positive restrictions ORed together. Doing this required some changes to the VerbNet data. Specifically, seven new restrictions were created to represent the complex cases in the original data, which were either AND clauses or negative restrictions. The new restrictions are:

- **concrete_inanimate**: concrete && !animate
- **group**: concrete && plural
- **inanimate**: !animate
- **non_region_location**: location && !region
- **non_solid_food**: comestible && !solid
- **slinky**: nonrigid && elongated
- **solid_food**: comestible && solid
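
A minimal Python sketch of this flattening follows, assuming the original restrictions are represented as an OR of AND clauses whose literals are `(name, is_positive)` pairs. That representation, the `COMPOUND` table name, and the `flatten` function are illustrative assumptions; only the seven mappings themselves come from the list above.

```python
# Each compound clause is a frozenset of (name, is_positive) literals,
# mapped to the new positive restriction that replaces it.
COMPOUND = {
    frozenset({("concrete", True), ("animate", False)}): "concrete_inanimate",
    frozenset({("concrete", True), ("plural", True)}): "group",
    frozenset({("animate", False)}): "inanimate",
    frozenset({("location", True), ("region", False)}): "non_region_location",
    frozenset({("comestible", True), ("solid", False)}): "non_solid_food",
    frozenset({("nonrigid", True), ("elongated", True)}): "slinky",
    frozenset({("comestible", True), ("solid", True)}): "solid_food",
}


def flatten(or_of_ands):
    """Turn an OR of AND clauses into a set of positive names ORed together."""
    flat = set()
    for clause in or_of_ands:
        clause = frozenset(clause)
        if len(clause) == 1 and next(iter(clause))[1]:
            # A single positive literal needs no rewriting.
            flat.add(next(iter(clause))[0])
        elif clause in COMPOUND:
            flat.add(COMPOUND[clause])
        else:
            raise ValueError(f"no flattened restriction for {sorted(clause)}")
    return flat


# Hypothetical input: animate OR (concrete AND NOT animate).
example = [[("animate", True)], [("concrete", True), ("animate", False)]]
print(sorted(flatten(example)))
```

After flattening, checking whether a noun satisfies a set of restrictions reduces to a single set-intersection test, which is what makes the flattened form easy to query.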

For prepositions, selectional restrictions are always positive. Usually at most one restriction is used, but in the rare event that more are present (6 cases out of 146), they are always applied using OR logic. The restrictions used for prepositions are the names of the preposition groups defined in [prepositions.txt](https://code.fourisland.com/verbly/tree/generator/prepositions.txt), which makes querying for applicable prepositions easy.
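
Because preposition restrictions are group names combined with OR logic, a lookup is just a union over groups. The Python sketch below uses a small hypothetical in-memory table; the group names and memberships shown are illustrative only, and the actual contents and file format of prepositions.txt are not assumed here.

```python
# Hypothetical group table for illustration. The real group names and
# memberships are defined in prepositions.txt.
PREPOSITION_GROUPS = {
    "src": {"from", "out of"},
    "dest": {"to", "into", "onto"},
}


def prepositions_for(restrictions):
    """Return every preposition allowed by a set of group restrictions.

    Multiple restrictions are combined with OR logic, so the result is the
    union of the named groups.
    """
    allowed = set()
    for group in restrictions:
        allowed |= PREPOSITION_GROUPS[group]
    return allowed


print(sorted(prepositions_for({"src"})))
print(sorted(prepositions_for({"src", "dest"})))
```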