actio_python_utils.argparse_functions.EnhancedArgumentParser¶
- class actio_python_utils.argparse_functions.EnhancedArgumentParser(*args, description='The Sphinx documentation toolchain.', formatter_class=<class 'actio_python_utils.argparse_functions.CustomFormatter'>, use_logging=False, use_database=False, use_spark=False, use_xml=False, use_glow=False, use_spark_db=False, dont_create_db_connection=False, spark_extra_packages=None, **kwargs)[source]¶
Bases: ArgumentParser

Customized argparse.ArgumentParser that sets the description automatically, uses both the argparse.ArgumentDefaultsHelpFormatter and argparse.RawTextHelpFormatter formatters, and optionally sets up logging, database, and PySpark connections.

- Parameters:
  - *args – Optional positional arguments passed to the argparse.ArgumentParser() constructor
  - description (str, default: 'The Sphinx documentation toolchain.') – Passed to the argparse.ArgumentParser() constructor
  - formatter_class (HelpFormatter, default: <class 'actio_python_utils.argparse_functions.CustomFormatter'>) – The help formatter to use
  - use_logging (bool, default: False) – Adds log level and log format arguments, then sets up logging when parse_args() is called
  - use_database (bool, default: False) – Adds a database service argument, then creates a connection to the specified database with the attribute name db when parse_args() is called
  - use_spark (bool, default: False) – Adds spark cores, spark memory, and spark config arguments, then creates a PySpark session with the attribute name spark when parse_args() is called
  - use_xml (bool, default: False) – Adds dependencies to PySpark to parse XML files; sets use_spark = True
  - use_glow (bool, default: False) – Adds dependencies to PySpark to use glow, e.g. to parse VCF files; sets use_spark = True
  - use_spark_db (bool, default: False) – Adds dependencies to PySpark to connect to a database; sets use_spark = True and creates an object to create a database connection with PySpark with the attribute name spark_db when parse_args() is called
  - dont_create_db_connection (bool, default: False) – Don't create a database connection even if use_database = True
  - spark_extra_packages (Optional[Iterable[tuple[str, str]]], default: None) – Adds additional Spark package dependencies to initialize; sets use_spark = True
  - **kwargs – Any additional named arguments
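A minimal usage sketch. It assumes the package is importable as shown and that a PostgreSQL service named "mydb" exists; the script description, flag values, and file name are hypothetical:

    from actio_python_utils.argparse_functions import EnhancedArgumentParser

    # Parser that also configures logging and opens a database connection
    # once parse_args() runs.
    parser = EnhancedArgumentParser(
        description="Example loader script",
        use_logging=True,
        use_database=True,
    )
    parser.add_argument("-i", "--input", help="Input file to process")
    args = parser.parse_args(["-i", "data.tsv", "--service", "mydb"])

    # args.db now holds a psycopg2 connection (unless dont_create_db_connection=True),
    # and logging has been configured from --log-level / --log-format.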
- __init__(*args, description='The Sphinx documentation toolchain.', formatter_class=<class 'actio_python_utils.argparse_functions.CustomFormatter'>, use_logging=False, use_database=False, use_spark=False, use_xml=False, use_glow=False, use_spark_db=False, dont_create_db_connection=False, spark_extra_packages=None, **kwargs)[source]¶
Methods

- __init__(*args[, description, ...])
- add_argument([short_arg, long_arg]) – Adds an argument while retaining metavar instead of dest in help message
- add_argument_group(*args, **kwargs)
- add_db_service_argument([short_arg, ...]) – Adds an argument to set the database service name and sets dest = "db_service"
- add_log_format_argument([short_arg, ...]) – Adds an argument to set the logging format and sets dest = "log_format"
- add_log_level_argument([short_arg, ...]) – Adds an argument to set the logging level, converts it to the proper integer, and sets dest = "log_level"
- add_mutually_exclusive_group(**kwargs)
- add_spark_config_argument([short_arg, long_arg]) – Adds an argument to provide 0 or more options to initialize the PySpark session with and sets dest = "spark_config"
- add_spark_cores_argument([short_arg, ...]) – Adds an argument to set the number of PySpark cores to use and sets dest = "spark_cores"
- add_spark_load_config_argument([short_arg, ...]) – Adds an argument to provide 0 or more options to load a dataframe in PySpark with and sets dest = "spark_load_config"
- add_spark_memory_argument([short_arg, ...]) – Adds an argument to set the amount of memory to give to PySpark and sets dest = "spark_memory"
- add_subparsers(**kwargs)
- convert_arg_line_to_args(arg_line)
- error(message) – Prints a usage message incorporating the message to stderr and exits.
- exit([status, message])
- format_help()
- format_usage()
- get_default(dest)
- parse_args(*args[, db_connection_name, ...]) – Parses arguments while optionally setting up logging, database, and/or PySpark.
- parse_intermixed_args([args, namespace])
- parse_known_args([args, namespace])
- parse_known_intermixed_args([args, namespace])
- print_help([file])
- print_usage([file])
- register(registry_name, value, object)
- sanitize_argument(long_arg) – Converts the argument name to the variable actually used
- set_defaults(**kwargs)
- setup_database(args) – Returns a psycopg2 connection to the database specified in args.db_service
- setup_logging(args[, name, stream, ...]) – Sets up logging with setup_logging() and the specified log level and format
- setup_spark(args) – Returns a list with a created PySpark session and optionally a PostgreSQL login record if use_spark_db = True
- add_argument(short_arg=None, long_arg=None, *args, **kwargs)[source]¶
Adds an argument while retaining metavar instead of dest in help message
- Parameters:
  - short_arg (Optional[str], default: None) – The short argument name
  - long_arg (Optional[str], default: None) – The long argument name
  - *args – Any additional positional arguments
  - **kwargs – Any additional named arguments
- Return type:
None
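A short sketch of the metavar-preserving add_argument; the flag names and metavar are hypothetical:

    from actio_python_utils.argparse_functions import EnhancedArgumentParser

    parser = EnhancedArgumentParser()
    # Short and long names are passed explicitly; the help output keeps the
    # metavar (here OUTPUT) rather than the derived dest name.
    parser.add_argument("-o", "--output-dir", metavar="OUTPUT", help="Directory to write results to")
    parser.print_help()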
- add_db_service_argument(short_arg='-s', long_arg='--service', default=None, **kwargs)[source]¶
Adds an argument to set the database service name and sets
dest = "db_service"
- Parameters:
  - short_arg (Optional[str], default: '-s') – Short argument name to use
  - long_arg (Optional[str], default: '--service') – Long argument name to use
  - default (Optional[str], default: None) – Default service
  - **kwargs – Any additional named arguments
- add_log_format_argument(short_arg='-f', long_arg='--log-format', default='%(asctime)s - %(name)s - %(levelname)s - %(message)s', **kwargs)[source]¶
Adds an argument to set the logging format and sets
dest = "log_format"
- Parameters:
  - short_arg (Optional[str], default: '-f') – Short argument name to use
  - long_arg (Optional[str], default: '--log-format') – Long argument name to use
  - default (str, default: '%(asctime)s - %(name)s - %(levelname)s - %(message)s') – Default logging format
  - **kwargs – Any additional named arguments
- Return type:
None
- add_log_level_argument(short_arg='-l', long_arg='--log-level', default='INFO', **kwargs)[source]¶
Adds an argument to set the logging level, converts it to the proper integer, and sets
dest = "log_level"
- Parameters:
  - short_arg (Optional[str], default: '-l') – Short argument name to use
  - long_arg (Optional[str], default: '--log-level') – Long argument name to use
  - default (str, default: 'INFO') – Default logging level value
  - **kwargs – Any additional named arguments
- Return type:
None
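A brief sketch of wiring up the logging-related arguments by hand instead of via use_logging=True; the flag values are hypothetical:

    from actio_python_utils.argparse_functions import EnhancedArgumentParser

    parser = EnhancedArgumentParser()
    parser.add_log_level_argument()     # -l/--log-level, dest="log_level", default "INFO"
    parser.add_log_format_argument()    # -f/--log-format, dest="log_format"
    args = parser.parse_args(["-l", "DEBUG"])
    # Per the description above, args.log_level is converted to the matching
    # integer logging level.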
- add_spark_config_argument(short_arg=None, long_arg='--spark-config', **kwargs)[source]¶
Adds an argument to provide 0 or more options to initialize the PySpark session with and sets
dest = "spark_config"
- Parameters:
  - short_arg (Optional[str], default: None) – Short argument name to use
  - long_arg (Optional[str], default: '--spark-config') – Long argument name to use
  - **kwargs – Any additional named arguments
- Return type:
None
- add_spark_cores_argument(short_arg='-c', long_arg='--spark-cores', default='*', **kwargs)[source]¶
Adds an argument to set the number of PySpark cores to use and sets
dest = "spark_cores"
- Parameters:
  - short_arg (Optional[str], default: '-c') – Short argument name to use
  - long_arg (Optional[str], default: '--spark-cores') – Long argument name to use
  - default (int | str, default: '*') – Default cores
  - **kwargs – Any additional named arguments
- Return type:
None
- add_spark_load_config_argument(short_arg=None, long_arg='--spark-load-config', **kwargs)[source]¶
Adds an argument to provide 0 or more options to load a dataframe in PySpark with and sets
dest = "spark_load_config"
- Parameters:
  - short_arg (Optional[str], default: None) – Short argument name to use
  - long_arg (Optional[str], default: '--spark-load-config') – Long argument name to use
  - **kwargs – Any additional named arguments
- Return type:
None
- add_spark_memory_argument(short_arg='-m', long_arg='--spark-memory', default='1g', **kwargs)[source]¶
Adds an argument to set the amount of memory to give to PySpark and sets
dest = "spark_memory"
- Parameters:
  - short_arg (Optional[str], default: '-m') – Short argument name to use
  - long_arg (Optional[str], default: '--spark-memory') – Long argument name to use
  - default (str, default: '1g') – Default memory to use
  - **kwargs – Any additional named arguments
- Return type:
None
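A sketch combining the Spark-related argument helpers; the resource values are hypothetical, and no Spark session is created here because use_spark is left off:

    from actio_python_utils.argparse_functions import EnhancedArgumentParser

    parser = EnhancedArgumentParser()
    parser.add_spark_cores_argument()    # -c/--spark-cores, default "*"
    parser.add_spark_memory_argument()   # -m/--spark-memory, default "1g"
    parser.add_spark_config_argument()   # --spark-config, 0 or more options
    args = parser.parse_args(["-c", "4", "-m", "8g"])
    # args.spark_cores and args.spark_memory now carry the requested resources.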
- error(message: string)¶
Prints a usage message incorporating the message to stderr and exits.
If you override this in a subclass, it should not return – it should either exit or raise an exception.
- parse_args(*args, db_connection_name='db', spark_name='spark', spark_db_name='spark_db', **kwargs)[source]¶
Parses arguments while optionally setting up logging, database, and/or PySpark.
- Parameters:
  - *args – Any additional positional arguments
  - db_connection_name (str, default: 'db') – The args attribute name to give to a created database connection
  - spark_name (str, default: 'spark') – The args attribute name to give to a created PySpark session
  - spark_db_name (str, default: 'spark_db') – The args attribute name to give to PostgreSQL login credentials for use with PySpark
  - **kwargs – Any additional named arguments
- Return type:
Namespace
- Returns:
Parsed arguments, additionally with attribute db as a database connection if use_database = True, attribute spark if use_spark = True, and attribute spark_db if use_spark_db = True
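A sketch of renaming the attributes that parse_args() attaches; it assumes PySpark is installed and a "mydb" PostgreSQL service exists, and the attribute names are hypothetical:

    from actio_python_utils.argparse_functions import EnhancedArgumentParser

    parser = EnhancedArgumentParser(use_database=True, use_spark=True)
    args = parser.parse_args(
        ["--service", "mydb"],
        db_connection_name="pg_conn",   # attach the connection as args.pg_conn
        spark_name="session",           # attach the SparkSession as args.session
    )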
- static sanitize_argument(long_arg)[source]¶
Converts the argument name to the variable actually used
- Parameters:
  - long_arg (str) – The argument name
- Return type:
str
- Returns:
The reformatted argument
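The exact transformation is not spelled out above; a plausible illustration, assuming it mirrors argparse's usual dest derivation (leading dashes stripped, hyphens turned into underscores):

    from actio_python_utils.argparse_functions import EnhancedArgumentParser

    # Hypothetical: "--spark-load-config" would map to the attribute name
    # "spark_load_config" on the parsed Namespace.
    name = EnhancedArgumentParser.sanitize_argument("--spark-load-config")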
- setup_database(args)[source]¶
Returns a psycopg2 connection to the database specified in args.db_service
- Parameters:
  - args (Namespace) – Parsed arguments from parse_args()
- Return type:
connection
- Returns:
The psycopg2 connection
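A sketch of calling setup_database() manually when the automatic connection is suppressed; the "mydb" service name is hypothetical:

    from actio_python_utils.argparse_functions import EnhancedArgumentParser

    parser = EnhancedArgumentParser(use_database=True, dont_create_db_connection=True)
    args = parser.parse_args(["--service", "mydb"])
    conn = parser.setup_database(args)   # psycopg2 connection to args.db_service
    with conn.cursor() as cur:
        cur.execute("SELECT 1")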
- setup_logging(args, name='root', stream=None, stream_handler_logging_level=None)[source]¶
Sets up logging with
setup_logging()
and the specified log level and format
- Parameters:
  - args (Namespace) – Parsed arguments from parse_args()
  - name (str, default: 'root') – Logger name to initialize
  - stream (Optional[TextIOWrapper], default: None) – Stream to log to
  - stream_handler_logging_level (Union[int, str, None], default: None) – Logging level to use for stream
- Return type:
None
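A sketch of calling setup_logging() directly for an additional, differently named logger; the "myapp" name is hypothetical, and with use_logging=True the root logger has already been configured by parse_args():

    import sys

    from actio_python_utils.argparse_functions import EnhancedArgumentParser

    parser = EnhancedArgumentParser(use_logging=True)
    args = parser.parse_args(["--log-level", "DEBUG"])
    # Initialize a named logger that writes to stderr using the parsed
    # log level and format.
    parser.setup_logging(args, name="myapp", stream=sys.stderr)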
- setup_spark(args)[source]¶
Returns a list with a created PySpark session and optionally a PostgreSQL login record if
use_spark_db = True
- Parameters:
  - args (Namespace) – Parsed arguments from parse_args()
- Return type:
tuple[SparkSession, PassEntry]
- Returns:
A list with the created PySpark session and either a pgtoolkit.pgpass.PassEntry record or None
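A sketch of the documented flow. In typical use parse_args() invokes setup_spark() itself and exposes the results as attributes; whether additional database arguments are required for the login record is not shown here, so this only illustrates the attribute access:

    from actio_python_utils.argparse_functions import EnhancedArgumentParser

    parser = EnhancedArgumentParser(use_spark_db=True)
    args = parser.parse_args(["--spark-memory", "4g"])
    spark = args.spark        # the created pyspark.sql.SparkSession
    spark_db = args.spark_db  # PostgreSQL login record for use with PySpark
    df = spark.range(10)      # ordinary PySpark usage from here on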