actio_python_utils.argparse_functions.EnhancedArgumentParser¶
- class actio_python_utils.argparse_functions.EnhancedArgumentParser(*args, description='The Sphinx documentation toolchain.', formatter_class=<class 'actio_python_utils.argparse_functions.CustomFormatter'>, use_logging=False, use_database=False, use_spark=False, use_xml=False, use_glow=False, use_spark_db=False, dont_create_db_connection=False, spark_extra_packages=None, **kwargs)[source]¶
Bases: ArgumentParser
Customized argparse.ArgumentParser that sets the description automatically, uses both the argparse.ArgumentDefaultsHelpFormatter and argparse.RawTextHelpFormatter formatters, and optionally sets up logging, database, and PySpark connections.
- Parameters:
*args – Optional positional arguments passed to the argparse.ArgumentParser() constructor
description (str, default: 'The Sphinx documentation toolchain.') – Passed to the argparse.ArgumentParser() constructor
formatter_class (HelpFormatter, default: <class 'actio_python_utils.argparse_functions.CustomFormatter'>) – The help formatter to use
use_logging (bool, default: False) – Adds log level and log format arguments, then sets up logging when parse_args() is called
use_database (bool, default: False) – Adds a database service argument, then creates a connection to the specified database with the attribute name db when parse_args() is called
use_spark (bool, default: False) – Adds spark cores, spark memory, and spark config arguments, then creates a PySpark session with the attribute name spark when parse_args() is called
use_xml (bool, default: False) – Adds dependencies to PySpark to parse XML files; sets use_spark = True
use_glow (bool, default: False) – Adds dependencies to PySpark to use glow, e.g. to parse VCF files; sets use_spark = True
use_spark_db (bool, default: False) – Adds dependencies to PySpark to connect to a database; sets use_spark = True and creates an object used to create a database connection with PySpark with the attribute name spark_db when parse_args() is called
dont_create_db_connection (bool, default: False) – Don't create a database connection even if use_database = True
spark_extra_packages (Optional[Iterable[tuple[str, str]]], default: None) – Additional Spark package dependencies to initialize; sets use_spark = True
**kwargs – Any additional named arguments
- __init__(*args, description='The Sphinx documentation toolchain.', formatter_class=<class 'actio_python_utils.argparse_functions.CustomFormatter'>, use_logging=False, use_database=False, use_spark=False, use_xml=False, use_glow=False, use_spark_db=False, dont_create_db_connection=False, spark_extra_packages=None, **kwargs)[source]¶
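For orientation, the sketch below shows one hypothetical way to construct the parser with logging and database support enabled; the description string, option names, and the existence of a working database service are assumptions, not part of this API reference.

    from actio_python_utils.argparse_functions import EnhancedArgumentParser

    # Hypothetical example: with use_logging and use_database enabled, parse_args()
    # is documented to also configure logging and attach a database connection.
    parser = EnhancedArgumentParser(
        description="Load a data file into the configured database.",  # assumed text
        use_logging=True,
        use_database=True,
    )
    parser.add_argument("-i", "--input-file", help="File to load")  # assumed option
    args = parser.parse_args()

    # Expected attributes, per the parameter descriptions above:
    # args.log_level   - from the automatically added log level argument
    # args.db          - database connection created because use_database=True
    # args.input_file  - ordinary argparse attribute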
Methods
__init__(*args[, description, ...])
add_argument([short_arg, long_arg]) – Adds an argument while retaining metavar instead of dest in help message
add_argument_group(*args, **kwargs)
add_db_service_argument([short_arg, ...]) – Adds an argument to set the database service name and sets dest = "db_service"
add_log_format_argument([short_arg, ...]) – Adds an argument to set the logging format and sets dest = "log_format"
add_log_level_argument([short_arg, ...]) – Adds an argument to set the logging level, converts it to the proper integer, and sets dest = "log_level"
add_mutually_exclusive_group(**kwargs)
add_spark_config_argument([short_arg, long_arg]) – Adds an argument to provide 0 or more options to initialize the PySpark session with and sets dest = "spark_config"
add_spark_cores_argument([short_arg, ...]) – Adds an argument to set the number of PySpark cores to use and sets dest = "spark_cores"
add_spark_load_config_argument([short_arg, ...]) – Adds an argument to provide 0 or more options to load a dataframe in PySpark with and sets dest = "spark_load_config"
add_spark_memory_argument([short_arg, ...]) – Adds an argument to set the amount of memory to give to PySpark and sets dest = "spark_memory"
add_subparsers(**kwargs)
convert_arg_line_to_args(arg_line)
error(message) – Prints a usage message incorporating the message to stderr and exits.
exit([status, message])
format_help()
format_usage()
get_default(dest)
parse_args(*args[, db_connection_name, ...]) – Parses arguments while optionally setting up logging, database, and/or PySpark.
parse_intermixed_args([args, namespace])
parse_known_args([args, namespace])
parse_known_intermixed_args([args, namespace])
print_help([file])
print_usage([file])
register(registry_name, value, object)
sanitize_argument(long_arg) – Converts the argument name to the variable actually used
set_defaults(**kwargs)
setup_database(args) – Returns a psycopg2 connection to the database specified in args.db_service
setup_logging(args[, name, stream, ...]) – Sets up logging with setup_logging() and specified log level and format
setup_spark(args) – Returns a list with a created PySpark session and optionally a PostgreSQL login record if use_spark_db = True
- add_argument(short_arg=None, long_arg=None, *args, **kwargs)[source]¶
Adds an argument while retaining metavar instead of dest in help message
- Parameters:
short_arg (Optional[str], default: None) – The short argument name
long_arg (Optional[str], default: None) – The long argument name
*args – Any additional positional arguments
**kwargs – Any additional named arguments
- Return type:
None
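As an illustration only (the option name and the exact help rendering are assumptions based on the description above):

    from actio_python_utils.argparse_functions import EnhancedArgumentParser

    parser = EnhancedArgumentParser()
    # short_arg and long_arg are the first two parameters; remaining keyword
    # arguments are forwarded to the usual argparse add_argument machinery.
    parser.add_argument("-o", "--output-dir", default=".", help="Where to write results")
    # Per the description above, the help text should display the metavar
    # (e.g. OUTPUT_DIR) rather than the dest name.
    parser.print_help()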
- add_db_service_argument(short_arg='-s', long_arg='--service', default=None, **kwargs)[source]¶
Adds an argument to set the database service name and sets dest = "db_service"
- Parameters:
short_arg (Optional[str], default: '-s') – Short argument name to use
long_arg (Optional[str], default: '--service') – Long argument name to use
default (Optional[str], default: None) – Default service
**kwargs – Any additional named arguments
- add_log_format_argument(short_arg='-f', long_arg='--log-format', default='%(asctime)s - %(name)s - %(levelname)s - %(message)s', **kwargs)[source]¶
Adds an argument to set the logging format and sets dest = "log_format"
- Parameters:
short_arg (Optional[str], default: '-f') – Short argument name to use
long_arg (Optional[str], default: '--log-format') – Long argument name to use
default (str, default: '%(asctime)s - %(name)s - %(levelname)s - %(message)s') – Default logging format
**kwargs – Any additional named arguments
- Return type:
None
- add_log_level_argument(short_arg='-l', long_arg='--log-level', default='INFO', **kwargs)[source]¶
Adds an argument to set the logging level, converts it to the proper integer, and sets dest = "log_level"
- Parameters:
short_arg (Optional[str], default: '-l') – Short argument name to use
long_arg (Optional[str], default: '--log-level') – Long argument name to use
default (str, default: 'INFO') – Default logging level value
**kwargs – Any additional named arguments
- Return type:
None
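The two logging arguments above are the same ones the constructor adds when use_logging=True; the following hedged sketch adds them manually and assumes parse_args() forwards a positional argument list to argparse as usual.

    import logging

    from actio_python_utils.argparse_functions import EnhancedArgumentParser

    parser = EnhancedArgumentParser()
    parser.add_log_level_argument()    # adds -l/--log-level,  dest="log_level"
    parser.add_log_format_argument()   # adds -f/--log-format, dest="log_format"

    args = parser.parse_args(["-l", "DEBUG"])
    # Per the description above, the level is converted to the proper integer,
    # so args.log_level is expected to equal logging.DEBUG (10).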
- add_spark_config_argument(short_arg=None, long_arg='--spark-config', **kwargs)[source]¶
Adds an argument to provide 0 or more options to initialize the PySpark session with and sets dest = "spark_config"
- Parameters:
short_arg (Optional[str], default: None) – Short argument name to use
long_arg (Optional[str], default: '--spark-config') – Long argument name to use
**kwargs – Any additional named arguments
- Return type:
None
- add_spark_cores_argument(short_arg='-c', long_arg='--spark-cores', default='*', **kwargs)[source]¶
Adds an argument to set the number of PySpark cores to use and sets dest = "spark_cores"
- Parameters:
short_arg (Optional[str], default: '-c') – Short argument name to use
long_arg (Optional[str], default: '--spark-cores') – Long argument name to use
default (int | str, default: '*') – Default number of cores
**kwargs – Any additional named arguments
- Return type:
None
- add_spark_load_config_argument(short_arg=None, long_arg='--spark-load-config', **kwargs)[source]¶
Adds an argument to provide 0 or more options to load a dataframe in PySpark with and sets dest = "spark_load_config"
- Parameters:
short_arg (Optional[str], default: None) – Short argument name to use
long_arg (Optional[str], default: '--spark-load-config') – Long argument name to use
**kwargs – Any additional named arguments
- Return type:
None
- add_spark_memory_argument(short_arg='-m', long_arg='--spark-memory', default='1g', **kwargs)[source]¶
Adds an argument to set the amount of memory to give to PySpark and sets dest = "spark_memory"
- Parameters:
short_arg (Optional[str], default: '-m') – Short argument name to use
long_arg (Optional[str], default: '--spark-memory') – Long argument name to use
default (str, default: '1g') – Default memory to use
**kwargs – Any additional named arguments
- Return type:
None
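These Spark-related arguments mirror what use_spark=True adds automatically; a hypothetical sketch of adding and parsing them by hand (the parsed values shown in the comments are assumptions):

    from actio_python_utils.argparse_functions import EnhancedArgumentParser

    parser = EnhancedArgumentParser()
    parser.add_spark_cores_argument()    # -c/--spark-cores,  dest="spark_cores",  default "*"
    parser.add_spark_memory_argument()   # -m/--spark-memory, dest="spark_memory", default "1g"
    parser.add_spark_config_argument()   # --spark-config,    dest="spark_config"

    args = parser.parse_args(["-c", "4", "-m", "8g"])
    # Assumed result: args.spark_cores holds "4", args.spark_memory holds "8g",
    # and args.spark_config holds any --spark-config options supplied.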
- error(message: string)¶
Prints a usage message incorporating the message to stderr and exits.
If you override this in a subclass, it should not return – it should either exit or raise an exception.
- parse_args(*args, db_connection_name='db', spark_name='spark', spark_db_name='spark_db', **kwargs)[source]¶
Parses arguments while optionally setting up logging, database, and/or PySpark.
- Parameters:
*args – Any additional positional arguments
db_connection_name (str, default: 'db') – The args attribute name to give to a created database connection
spark_name (str, default: 'spark') – The args attribute name to give to a created PySpark session
spark_db_name (str, default: 'spark_db') – The args attribute name to give to PostgreSQL login credentials for use with PySpark
**kwargs – Any additional named arguments
- Return type:
Namespace
- Returns:
Parsed arguments, additionally with attribute db as a database connection if use_database = True, with attribute spark if use_spark = True, and attribute spark_db if use_spark_db = True
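A hedged end-to-end sketch using the default attribute names; "my_service" is an invented PostgreSQL service name, and a reachable database plus a local Spark installation are assumed.

    from actio_python_utils.argparse_functions import EnhancedArgumentParser

    parser = EnhancedArgumentParser(use_database=True, use_spark=True)

    # db_connection_name / spark_name / spark_db_name control the attribute names;
    # the defaults are "db", "spark", and "spark_db".
    args = parser.parse_args(["--service", "my_service"])

    conn = args.db       # database connection, because use_database=True
    spark = args.spark   # PySpark session, because use_spark=True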
- static sanitize_argument(long_arg)[source]¶
Converts the argument name to the variable actually used
- Parameters:
long_arg (str) – The argument name
- Return type:
str
- Returns:
The reformatted argument
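An illustrative sketch, assuming the mapping follows the usual argparse convention of stripping leading dashes and replacing hyphens with underscores:

    from actio_python_utils.argparse_functions import EnhancedArgumentParser

    # Expected (assumed) results shown in the comments:
    EnhancedArgumentParser.sanitize_argument("--log-level")     # "log_level"
    EnhancedArgumentParser.sanitize_argument("--spark-config")  # "spark_config"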
- setup_database(args)[source]¶
Returns a psycopg2 connection to the database specified in args.db_service
- Parameters:
args (Namespace) – Parsed arguments from parse_args()
- Return type:
connection
- Returns:
The psycopg2 connection
- setup_logging(args, name='root', stream=None, stream_handler_logging_level=None)[source]¶
Sets up logging with setup_logging() and the specified log level and format
- Parameters:
args (Namespace) – Parsed arguments from parse_args()
name (str, default: 'root') – Logger name to initialize
stream (Optional[TextIOWrapper], default: None) – Stream to log to
stream_handler_logging_level (Union[int, str, None], default: None) – Logging level to use for the stream
- Return type:
None
- setup_spark(args)[source]¶
Returns a list with a created PySpark session and optionally a PostgreSQL login record if use_spark_db = True
- Parameters:
args (Namespace) – Parsed arguments from parse_args()
- Return type:
tuple[SparkSession, PassEntry]
- Returns:
A list with the created PySpark session and either a pgtoolkit.pgpass.PassEntry record or None
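setup_database(), setup_logging(), and setup_spark() are normally invoked by parse_args() itself; the sketch below calls setup_spark() directly purely for illustration and assumes a local Spark installation and valid PostgreSQL credentials.

    from actio_python_utils.argparse_functions import EnhancedArgumentParser

    parser = EnhancedArgumentParser(use_spark=True, use_spark_db=True)
    args = parser.parse_args([])  # parse_args() already sets args.spark / args.spark_db

    # Calling setup_spark() directly returns the documented pair:
    # (SparkSession, pgtoolkit.pgpass.PassEntry or None).
    spark, pg_login = parser.setup_spark(args)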