glazing.verbnet.converter
Converting VerbNet XML to JSON Lines.
VerbNet XML to JSON Lines converter.
This module converts VerbNet XML files to JSON Lines using the glazing VerbNet models. It handles the verb class hierarchy with role inheritance, selectional restrictions with nested logic, and cross-references.
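As a rough illustration of the cross-references mentioned above, the sketch below splits whitespace-separated reference attributes on a MEMBER element. The attribute names (`wn`, `grouping`, `fn_mapping`) are assumed from VerbNet 3.x XML, `split_refs` is a hypothetical helper, and the standard-library parser stands in for whatever the module uses; none of this is the module's own code.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

# Assumed MEMBER layout, modeled on VerbNet 3.x files (not taken from this module).
MEMBER_XML = (
    '<MEMBER name="give" wn="give%2:40:03 give%2:40:00" '
    'grouping="give.01 give.02"/>'
)


def split_refs(element: ET.Element, attribute: str) -> list[str]:
    """Return the whitespace-separated values of an attribute, or [] if absent."""
    return element.get(attribute, "").split()


member = ET.fromstring(MEMBER_XML)
wordnet_keys = split_refs(member, "wn")        # WordNet sense keys
groupings = split_refs(member, "grouping")     # sense grouping identifiers
framenet = split_refs(member, "fn_mapping")    # attribute absent in this sample
```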
| CLASS | DESCRIPTION |
|---|---|
| `VerbNetConverter` | Convert VerbNet XML files to JSON Lines format. |
| FUNCTION | DESCRIPTION |
|---|---|
| `convert_verbnet_file` | Convert a single VerbNet XML file to VerbClass model. |
| `convert_verbnet_directory` | Convert all VerbNet XML files in a directory to JSON Lines. |
| `parse_member_cross_references` | Parse cross-references from member attributes. |
| `parse_selectional_restrictions` | Parse selectional restrictions with nested logic. |
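To make "nested logic" concrete, here is a minimal sketch of recursively parsing a SELRESTRS element into a nested dictionary. It assumes the usual VerbNet layout (SELRESTR leaves with `Value` and `type` attributes, an optional `logic` attribute on SELRESTRS). The function name `parse_selrestrs_sketch`, the output shape, and the "and" default are illustrative assumptions, not the behavior of `parse_selectional_restrictions`.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

SELRESTRS_XML = """
<SELRESTRS logic="or">
    <SELRESTR Value="+" type="animate"/>
    <SELRESTRS>
        <SELRESTR Value="+" type="concrete"/>
        <SELRESTR Value="-" type="location"/>
    </SELRESTRS>
</SELRESTRS>
""".strip()


def parse_selrestrs_sketch(element: ET.Element) -> dict:
    """Recursively turn a SELRESTRS element into a nested dictionary."""
    restrictions = []
    for child in element:
        if child.tag == "SELRESTR":
            # Leaf restriction: a polarity ("+" or "-") and a type name.
            restrictions.append(
                {"value": child.get("Value"), "type": child.get("type")}
            )
        elif child.tag == "SELRESTRS":
            # Nested group: recurse so its own logic is preserved.
            restrictions.append(parse_selrestrs_sketch(child))
    # Assumption: a missing logic attribute means the restrictions are conjoined.
    return {"logic": element.get("logic", "and"), "restrictions": restrictions}


parsed = parse_selrestrs_sketch(ET.fromstring(SELRESTRS_XML))
```

Keeping each group as its own dictionary means an "or" group can contain an "and" group without flattening either.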
Examples:
>>> from pathlib import Path
>>> from glazing.verbnet.converter import VerbNetConverter
>>> converter = VerbNetConverter()
>>> verb_class = converter.convert_verbnet_file("verbnet/give-13.1.xml")
>>> print(verb_class.id)
give-13.1
>>> # Convert entire directory
>>> count = converter.convert_verbnet_directory(
...     input_dir="verbnet_v34",
...     output_file="verbnet.jsonl"
... )
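Once a directory has been converted, the output can be read back with the standard library alone. The sketch below assumes only that the file follows the JSON Lines convention of one JSON object per line; `iter_jsonl` is a hypothetical helper, and the field names in each record depend on the VerbClass model.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Iterator


def iter_jsonl(path: Path | str) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a JSON Lines file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():  # tolerate blank lines
                yield json.loads(line)
```

For example, `[record.get("id") for record in iter_jsonl("verbnet.jsonl")]` would collect the class IDs, assuming each record carries an `id` field like the model in the example above.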
Classes
VerbNetConverter
Convert VerbNet XML files to JSON Lines format.
Handles VerbNet XML parsing with proper inheritance resolution, cross-reference extraction, and complex selectional restrictions.
| METHOD | DESCRIPTION |
|---|---|
| `convert_verbnet_file` | Convert a single VerbNet XML file to VerbClass model. |
| `convert_verbnet_directory` | Convert all VerbNet XML files to JSON Lines. |
| `parse_verb_class` | Parse a VNCLASS element into VerbClass model. |
| `parse_members` | Parse MEMBERS element into list of Member models. |
| `parse_themroles` | Parse THEMROLES element into list of ThematicRole models. |
| `parse_frames` | Parse FRAMES element into list of VNFrame models. |
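The inheritance resolution mentioned in the class description can be pictured as merging a subclass's thematic roles over those of its parent. The sketch below uses plain dictionaries and a hypothetical `resolve_roles` helper. The rule that a redeclared role replaces the inherited one is an assumption for illustration; the converter's actual merge rules may differ.

```python
from __future__ import annotations


def resolve_roles(
    parent_roles: dict[str, list[str]],
    own_roles: dict[str, list[str]],
) -> dict[str, list[str]]:
    """Merge inherited roles with a subclass's own roles (own roles win)."""
    resolved = dict(parent_roles)  # start from everything the parent declares
    resolved.update(own_roles)     # redeclared roles replace the inherited entry
    return resolved


# Role name -> selectional restrictions (illustrative values only).
parent = {"Agent": ["+animate"], "Theme": [], "Recipient": ["+animate"]}
child = {"Theme": ["+concrete"]}
effective = resolve_roles(parent, child)
```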
Functions
convert_verbnet_directory(input_dir: Path | str, output_file: Path | str) -> int
Convert all VerbNet XML files in a directory to JSON Lines.
| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `input_dir` | `Path \| str` | Directory containing VerbNet XML files. |
| `output_file` | `Path \| str` | Output JSON Lines file path. |

| RETURNS | DESCRIPTION |
|---|---|
| `int` | Number of files processed. |

| RAISES | DESCRIPTION |
|---|---|
| `FileNotFoundError` | If the input directory does not exist. |
Source code in src/glazing/verbnet/converter.py
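The contract documented above (check that the directory exists, write one line per XML file, return the count) can be sketched with the standard library. This is not the implementation in `converter.py`: `convert_directory_sketch` is a hypothetical function, the `*.xml` glob and sorted order are assumptions, and each record is a minimal placeholder rather than a serialized VerbClass.

```python
from __future__ import annotations

import json
import xml.etree.ElementTree as ET
from pathlib import Path


def convert_directory_sketch(input_dir: Path | str, output_file: Path | str) -> int:
    """Write one JSON object per XML file and return the number of files processed."""
    input_path = Path(input_dir)
    if not input_path.is_dir():
        raise FileNotFoundError(f"Input directory not found: {input_path}")

    count = 0
    with Path(output_file).open("w", encoding="utf-8") as out:
        # Sort so the output order is deterministic across platforms.
        for xml_file in sorted(input_path.glob("*.xml")):
            root = ET.parse(xml_file).getroot()
            # Placeholder record; a real converter would emit a full model here.
            record = {"id": root.get("ID"), "source": xml_file.name}
            out.write(json.dumps(record) + "\n")
            count += 1
    return count
```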
convert_verbnet_file(filepath: Path | str) -> VerbClass
Convert a single VerbNet XML file to VerbClass model.
| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `filepath` | `Path \| str` | Path to VerbNet XML file. |

| RETURNS | DESCRIPTION |
|---|---|
| `VerbClass` | Parsed VerbClass model with all subclasses. |

| RAISES | DESCRIPTION |
|---|---|
| `FileNotFoundError` | If the input file does not exist. |
| `ValueError` | If XML parsing fails or structure is invalid. |
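The two error conditions listed above can be illustrated with a small loader that reports a missing file as `FileNotFoundError` and wraps both parse failures and an unexpected root element in `ValueError`. `load_vnclass_sketch` is a hypothetical helper built on the standard-library parser. The check for a `VNCLASS` root is an assumption about what "structure is invalid" covers.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET
from pathlib import Path


def load_vnclass_sketch(filepath: Path | str) -> ET.Element:
    """Load a VerbNet XML file and return its VNCLASS root element."""
    path = Path(filepath)
    if not path.exists():
        raise FileNotFoundError(f"VerbNet file not found: {path}")
    try:
        root = ET.parse(path).getroot()
    except ET.ParseError as exc:
        # Surface malformed XML as ValueError, keeping the original cause.
        raise ValueError(f"Invalid XML in {path}: {exc}") from exc
    if root.tag != "VNCLASS":
        raise ValueError(f"Expected VNCLASS root in {path}, found {root.tag}")
    return root
```

Callers can then handle a bad path and a bad file separately with `except FileNotFoundError` and `except ValueError`.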
parse_verb_class(element: etree._Element, parent_id: VerbClassID | None = None) -> VerbClass
Parse a VNCLASS element into VerbClass model.
| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `element` | `etree._Element` | VNCLASS XML element. |
| `parent_id` | `VerbClassID \| None` | Parent class ID for subclasses. Defaults to `None`. |

| RETURNS | DESCRIPTION |
|---|---|
| `VerbClass` | Parsed VerbClass model. |

| RAISES | DESCRIPTION |
|---|---|
| `ValueError` | If required attributes are missing or invalid. |
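The role of `parent_id` is easiest to see in a recursive sketch: the top-level class is parsed with no parent, and each subclass is parsed with the enclosing class's ID. The code below assumes the usual VerbNet nesting (`SUBCLASSES/VNSUBCLASS` with an `ID` attribute), returns plain dictionaries instead of VerbClass models, and uses a hypothetical name `parse_class_sketch`.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET


def parse_class_sketch(element: ET.Element, parent_id: str | None = None) -> dict:
    """Parse a VNCLASS or VNSUBCLASS element into a nested dictionary."""
    class_id = element.get("ID")
    if not class_id:
        raise ValueError(f"<{element.tag}> is missing its ID attribute")
    return {
        "id": class_id,
        "parent_id": parent_id,
        # Each subclass is parsed with this class's ID as its parent.
        "subclasses": [
            parse_class_sketch(sub, parent_id=class_id)
            for sub in element.findall("SUBCLASSES/VNSUBCLASS")
        ],
    }


VNCLASS_XML = """<VNCLASS ID="give-13.1">
  <SUBCLASSES>
    <VNSUBCLASS ID="give-13.1-1"><SUBCLASSES/></VNSUBCLASS>
  </SUBCLASSES>
</VNCLASS>"""
tree = parse_class_sketch(ET.fromstring(VNCLASS_XML))
```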
Functions
convert_verbnet_directory(input_dir: Path | str, output_file: Path | str) -> int
Convert all VerbNet XML files in a directory to JSON Lines.
| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `input_dir` | `Path \| str` | Directory containing VerbNet XML files. |
| `output_file` | `Path \| str` | Output JSON Lines file path. |

| RETURNS | DESCRIPTION |
|---|---|
| `int` | Number of files processed. |
convert_verbnet_file(filepath: Path | str) -> VerbClass
Convert a single VerbNet XML file to VerbClass model.
| PARAMETER | TYPE | DESCRIPTION |
|---|---|---|
| `filepath` | `Path \| str` | Path to VerbNet XML file. |

| RETURNS | DESCRIPTION |
|---|---|
| `VerbClass` | Parsed VerbClass model. |