Corrections
Make corrections to your data.
corrections_from_dict(corrections_dict)
Create a list of Correction objects from a simpler dict-based config for corrections. Each key maps to one of: the label to convert to, a tuple of (from_label, to_label), or a tuple of (List[from_labels], to_label) if you want to convert a subset of labels at a time.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
corrections_dict | Dict[str, Any] | Corrections formatted dict, e.g. { "united states": "GPE", "London": (["LOC"], "GPE") } | required |
Raises:
Type | Description |
---|---|
ValueError | If the format of the dict is invalid |
Returns:
Type | Description |
---|---|
List[Correction] | List of Correction objects created from the dict |
Source code in recon/corrections.py
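The dict format described for corrections_from_dict can be illustrated with a small standalone sketch. This is not the code from recon/corrections.py: `SimpleCorrection` and `normalize_corrections` are hypothetical names, and representing "any current label" as `None` is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class SimpleCorrection:
    """Hypothetical stand-in for a Correction object (field names are assumptions)."""

    annotation: str
    from_labels: Optional[List[str]]  # None is used here to mean "any current label"
    to_label: str


def normalize_corrections(corrections_dict: Dict[str, Any]) -> List[SimpleCorrection]:
    """Turn the compact dict format into a flat list of correction records."""
    corrections = []
    for text, value in corrections_dict.items():
        if isinstance(value, str):
            # "united states": "GPE" -> relabel regardless of the current label
            corrections.append(SimpleCorrection(text, None, value))
        elif isinstance(value, tuple) and len(value) == 2:
            # "London": (["LOC"], "GPE") or "London": ("LOC", "GPE")
            from_labels, to_label = value
            if isinstance(from_labels, str):
                from_labels = [from_labels]
            corrections.append(SimpleCorrection(text, list(from_labels), to_label))
        else:
            raise ValueError(f"Unsupported correction format for {text!r}: {value!r}")
    return corrections


records = normalize_corrections(
    {"united states": "GPE", "London": (["LOC"], "GPE")}
)
```

Both tuple forms end up with a list of source labels, so downstream code only has to handle one shape.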
fix_annotations(example, corrections, case_sensitive=False, dryrun=False)
Fix annotations in a copy of the input Example.
This function will NOT add annotations to your data. It will only remove erroneous annotations and fix the labels for specific spans.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
example | Example | Input Example | required |
corrections | Dict[str, str] | Dictionary of corrections mapping entity text to a new label. If the value is set to None, the annotation will be removed | required |
case_sensitive | bool | Consider case of text for each correction | False |
dryrun | bool | Treat corrections as a dry run and just print all changes to be made | False |
Returns:
Name | Type | Description |
---|---|---|
Example | Example | Example with fixed annotations |
Source code in recon/corrections.py
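The relabel-or-remove behavior documented for fix_annotations can be sketched on plain dicts. This is an illustration only, not the code from recon/corrections.py: `fix_span_labels` is a hypothetical helper, and the example layout (a "text" string plus "spans" with "start", "end" and "label") is an assumption standing in for the Example type.

```python
from copy import deepcopy
from typing import Any, Dict, Optional


def fix_span_labels(
    example: Dict[str, Any],
    corrections: Dict[str, Optional[str]],
    case_sensitive: bool = False,
) -> Dict[str, Any]:
    """Relabel or remove spans whose text appears in `corrections`."""
    fixed = deepcopy(example)  # work on a copy so the input stays untouched
    if not case_sensitive:
        corrections = {key.lower(): label for key, label in corrections.items()}

    kept = []
    for span in fixed["spans"]:
        span_text = fixed["text"][span["start"]:span["end"]]
        key = span_text if case_sensitive else span_text.lower()
        if key not in corrections:
            kept.append(span)  # no correction for this text: keep as is
        elif corrections[key] is not None:
            span["label"] = corrections[key]  # relabel the span
            kept.append(span)
        # a value of None means the annotation is dropped
    fixed["spans"] = kept
    return fixed


text = "I flew from London to the United States."
example = {
    "text": text,
    "spans": [
        {"start": text.index("London"), "end": text.index("London") + 6, "label": "LOC"},
        {"start": text.index("flew"), "end": text.index("flew") + 4, "label": "GPE"},
    ],
}
fixed = fix_span_labels(example, {"london": "GPE", "flew": None})
```

No spans are ever added here, which matches the note above that the function only removes or relabels existing annotations.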
rename_labels(example, label_map)
Rename labels in a copy of the input Example.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
example | Example | Input Example | required |
label_map | Dict[str, str] | One-to-one mapping of label names | required |
Returns:
Name | Type | Description |
---|---|---|
Example | Example | Copy of Example with renamed labels |
Source code in recon/corrections.py
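A one-to-one label mapping like the one rename_labels takes can be sketched on plain dicts. This is not the code from recon/corrections.py: `rename_span_labels` is a hypothetical helper, the dict layout stands in for the Example type, and leaving unmapped labels unchanged is an assumption.

```python
from copy import deepcopy
from typing import Any, Dict


def rename_span_labels(example: Dict[str, Any], label_map: Dict[str, str]) -> Dict[str, Any]:
    """Return a copy of `example` with span labels renamed via `label_map`."""
    renamed = deepcopy(example)
    for span in renamed["spans"]:
        # Assumption: labels missing from the map are left unchanged.
        span["label"] = label_map.get(span["label"], span["label"])
    return renamed


example = {
    "text": "Anna lives in Paris.",
    "spans": [
        {"start": 0, "end": 4, "label": "PER"},
        {"start": 14, "end": 19, "label": "LOC"},
    ],
}
renamed = rename_span_labels(example, {"PER": "PERSON"})
```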
split_sentences(example, preprocessed_outputs={})
Split a single example into multiple examples, one per sentence. Entity and token offsets are reset so that they are relative to the boundaries of each sentence.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
example | Example | Input Example | required |
preprocessed_outputs | Dict[str, Any] | Outputs of preprocessors. | {} |
Returns:
Type | Description |
---|---|
List[Example] | List of split examples. Contains a single example if the input is just one sentence. |
Source code in recon/corrections.py
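Resetting offsets relative to sentence boundaries can be sketched as follows. This is not the code from recon/corrections.py: `split_on_sentences` is a hypothetical helper, the dict layout stands in for the Example type, a naive regular expression replaces whatever sentence segmentation the library uses, and dropping spans that cross a sentence boundary is an assumption. Token offsets are not handled.

```python
import re
from typing import Any, Dict, List


def split_on_sentences(example: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Split one example into per-sentence examples with re-based span offsets."""
    text = example["text"]
    results = []
    # Naive splitter: a run of non-terminators, then terminators, then whitespace.
    for match in re.finditer(r"[^.!?]+[.!?]*\s*", text):
        sent_start = match.start()
        sent_text = match.group().rstrip()
        sent_end = sent_start + len(sent_text)
        spans = [
            {
                "start": span["start"] - sent_start,  # offsets relative to the sentence
                "end": span["end"] - sent_start,
                "label": span["label"],
            }
            for span in example["spans"]
            # Assumption: spans crossing a sentence boundary are dropped.
            if sent_start <= span["start"] and span["end"] <= sent_end
        ]
        results.append({"text": sent_text, "spans": spans})
    return results


text = "Anna lives in Paris. She works for Acme."
example = {
    "text": text,
    "spans": [
        {"start": text.index("Paris"), "end": text.index("Paris") + 5, "label": "LOC"},
        {"start": text.index("Acme"), "end": text.index("Acme") + 4, "label": "ORG"},
    ],
}
sentences = split_on_sentences(example)
```

Each resulting span indexes into its own sentence text, so slicing `sentence["text"][start:end]` still yields the original entity string.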
strip_annotations(example, *, strip_chars=['.', '!', '?', '-', ':', ' '], preprocessed_outputs={})
Strip punctuation and spaces from the start and end of annotations. These characters are almost always an annotation mistake and will confuse a model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
example | Example | Input Example | required |
strip_chars | List[str] | Characters to strip. | ['.', '!', '?', '-', ':', ' '] |
Returns:
Name | Type | Description |
---|---|---|
Example | Example | Example with stripped spans |
Source code in recon/corrections.py
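Trimming unwanted characters from span boundaries comes down to moving the start and end offsets inward. The sketch below is not the code from recon/corrections.py: `strip_span_edges` is a hypothetical helper and the dict layout stands in for the Example type. The default characters mirror the strip_chars default in the signature above.

```python
from copy import deepcopy
from typing import Any, Dict, Sequence


def strip_span_edges(
    example: Dict[str, Any],
    strip_chars: Sequence[str] = (".", "!", "?", "-", ":", " "),
) -> Dict[str, Any]:
    """Return a copy of `example` whose spans no longer start or end with `strip_chars`."""
    stripped = deepcopy(example)
    text = stripped["text"]
    for span in stripped["spans"]:
        start, end = span["start"], span["end"]
        # Move the start forward past any leading characters to strip.
        while start < end and text[start] in strip_chars:
            start += 1
        # Move the end backward past any trailing characters to strip.
        while end > start and text[end - 1] in strip_chars:
            end -= 1
        span["start"], span["end"] = start, end
    return stripped


text = "Visit: London. Soon"
bad_start = text.index(" London.")
example = {
    "text": text,
    "spans": [{"start": bad_start, "end": bad_start + len(" London."), "label": "GPE"}],
}
cleaned = strip_span_edges(example)
```

Only the offsets change; the text itself is never modified, so other spans in the same example remain valid.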