glazing.cli.convert
Convert command implementation.
convert
CLI commands for converting datasets to JSON Lines format.
This module provides commands for converting linguistic datasets from their native formats (XML, database) to JSON Lines format for efficient processing.
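Because every line of a JSON Lines file is a standalone JSON document, converted output can be streamed one record at a time instead of being loaded whole. The following is a minimal sketch of that reading pattern, not part of this module: the helper `iter_jsonl`, the file name `verbnet.jsonl`, and the `id` field are all hypothetical, since the names the convert commands actually write are not stated here.

```python
import json
import tempfile
from collections.abc import Iterator
from pathlib import Path


def iter_jsonl(path: Path) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of a JSON Lines file."""
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines between records
                yield json.loads(line)


# Demo with made-up records; the file name and the "id" field are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "verbnet.jsonl"
    sample.write_text('{"id": "give-13.1"}\n{"id": "run-51.3.2"}\n', encoding="utf-8")
    records = list(iter_jsonl(sample))

print(len(records))
```

Using a generator keeps memory use proportional to a single record, which is the main practical benefit of the line-oriented format.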
Commands
| COMMAND | DESCRIPTION |
|---|---|
| convert dataset | Convert a specific dataset or all datasets to JSON Lines. |
| convert list | List available datasets for conversion. |
| convert info | Get information about a dataset's conversion process. |
| CLASS | DESCRIPTION |
|---|---|
| DatasetInfoDict | Type for dataset information. |
| FUNCTION | DESCRIPTION |
|---|---|
| convert | Convert linguistic datasets to JSON Lines format. |
| convert_dataset_cmd | Convert a dataset to JSON Lines format. |
| convert_framenet | Convert FrameNet XML frames to JSON Lines with lexical units. |
| convert_propbank | Convert PropBank XML framesets to JSON Lines. |
| convert_verbnet | Convert VerbNet XML files to JSON Lines. |
| convert_wordnet | Convert WordNet database to JSON Lines. |
| dataset_info_cmd | Get information about a dataset's conversion process. |
| list_datasets | List available datasets for conversion. |
Classes
DatasetInfoDict
Bases: TypedDict
Type for dataset information.
Functions
convert() -> None
Convert linguistic datasets to JSON Lines format.
convert_dataset_cmd(dataset: DatasetName, input_dir: str | Path, output_dir: str | Path, verbose: bool) -> None
Convert a dataset to JSON Lines format.
Examples:
Convert VerbNet:
    $ glazing convert dataset --dataset verbnet \
        --input-dir vn3.4/verbnet3.4 --output-dir output/
Convert all datasets:
    $ glazing convert dataset --dataset all --input-dir data/ --output-dir output/
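The same invocation can be scripted from Python. The sketch below only assembles the command line from the flags shown in the examples above (`--dataset`, `--input-dir`, `--output-dir`) and hands it to `subprocess.run`; the helpers `build_convert_command` and `run_convert` are hypothetical names introduced for illustration, and the `glazing` executable is assumed to be on `PATH`.

```python
import subprocess
from pathlib import Path


def build_convert_command(dataset: str, input_dir: Path, output_dir: Path) -> list[str]:
    """Build the argv list for `glazing convert dataset` (hypothetical helper)."""
    return [
        "glazing", "convert", "dataset",
        "--dataset", dataset,
        "--input-dir", str(input_dir),
        "--output-dir", str(output_dir),
    ]


def run_convert(dataset: str, input_dir: Path, output_dir: Path) -> None:
    """Run the conversion; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_convert_command(dataset, input_dir, output_dir), check=True)


# Only build and display the command here; run_convert() would execute it.
command = build_convert_command("verbnet", Path("vn3.4/verbnet3.4"), Path("output"))
print(" ".join(command))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when a directory name contains spaces.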
Source code in src/glazing/cli/convert.py
convert_framenet(input_dir: Path, output_dir: Path, verbose: bool = False) -> None
Convert FrameNet XML frames to JSON Lines with lexical units.
| PARAMETER | DESCRIPTION |
|---|---|
| input_dir | Directory containing FrameNet XML files (should have frame/ subdirectory). TYPE: Path |
| output_dir | Directory to write JSON Lines files to. TYPE: Path |
| verbose | Show verbose output. TYPE: bool DEFAULT: False |
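Since the input directory is expected to contain a frame/ subdirectory, a caller may want to verify that layout before starting a conversion. The following is a minimal sketch of such a pre-flight check; `has_framenet_frames` is a hypothetical helper, not part of this module, and it assumes the frame files carry an `.xml` extension.

```python
import tempfile
from pathlib import Path


def has_framenet_frames(input_dir: Path) -> bool:
    """Return True if input_dir has a frame/ subdirectory with at least one .xml file."""
    frame_dir = input_dir / "frame"
    # Assumption: frame files use the .xml extension.
    return frame_dir.is_dir() and any(frame_dir.glob("*.xml"))


# Demo on a throwaway directory; "Giving.xml" is a made-up file name.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    missing = has_framenet_frames(root)  # no frame/ directory yet
    (root / "frame").mkdir()
    (root / "frame" / "Giving.xml").write_text("<frame/>", encoding="utf-8")
    present = has_framenet_frames(root)

print(missing, present)
```

A check like this lets a script report a misconfigured path up front instead of relying on whatever error the conversion itself raises.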
convert_propbank(input_dir: Path, output_dir: Path, verbose: bool = False) -> None
Convert PropBank XML framesets to JSON Lines.
| PARAMETER | DESCRIPTION |
|---|---|
| input_dir | Directory containing PropBank XML files. TYPE: Path |
| output_dir | Directory to write JSON Lines files to. TYPE: Path |
| verbose | Show verbose output. TYPE: bool DEFAULT: False |
convert_verbnet(input_dir: Path, output_dir: Path, verbose: bool = False) -> None
Convert VerbNet XML files to JSON Lines.
| PARAMETER | DESCRIPTION |
|---|---|
| input_dir | Directory containing VerbNet XML files. TYPE: Path |
| output_dir | Directory to write JSON Lines files to. TYPE: Path |
| verbose | Show verbose output. TYPE: bool DEFAULT: False |
convert_wordnet(input_dir: Path, output_dir: Path, verbose: bool = False) -> None
Convert WordNet database to JSON Lines.
| PARAMETER | DESCRIPTION |
|---|---|
| input_dir | Directory containing WordNet database files. TYPE: Path |
| output_dir | Directory to write JSON Lines files to. TYPE: Path |
| verbose | Show verbose output. TYPE: bool DEFAULT: False |
dataset_info_cmd(dataset: str) -> None
Get information about a dataset's conversion process.
list_datasets() -> None
List available datasets for conversion.