On using config files with python’s argparse
After searching far and wide for a good way to easily integrate config files with argparse (including the possibility of specifying multiple config files that override each other) and not finding much, I thought I’d share some ideas for posterity. Comments and feedback are, of course, always welcome.
There are of course a multitude of solutions out there, consisting of various libraries with more or fewer features, such as sacred or hydra, but I tend to be a purist, both to minimize dependencies and to stick with a familiar command line interface without too much obscure syntax.
The idea is simple:
- Use argparse, a core Python library with a familiar Unix-style command line interface, to specify command line options
- Allow specifying arbitrary default values via the argparse interface
- Allow specifying multiple config files on the command line that override each other in order
- Override config file values with command line values
The solution turned out to be simpler than expected, albeit with three minor drawbacks (or advantages, depending on your point of view) that might require a little more overhead in production environments, unless you take the view that users who shoot themselves in the foot have only themselves to blame:
- There is no type checking on config file input
- There is no check whether config file values actually appear in the argument list
- store_true and store_false flags do not work well, although two workarounds are discussed below
The first allows a config file to change an argument's type, which can be useful or cause issues depending on context. The second allows flags to change during development without updating old config files, which again can be useful or a source of bugs (an incorrect flag in a config file goes unnoticed).
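If the first two points matter in your setting, one possible guard is to check the config dictionary before using it to update the parser defaults. The following is only a sketch under stated assumptions, not part of the approach described in this post: the helper name checked_defaults is made up for illustration, it relies on the private parser._actions attribute, and it only handles scalar and list values.

```python
import argparse


def checked_defaults(parser, conf):
    """Validate a config dict against a parser (hypothetical helper).

    Raises KeyError for keys the parser does not define and lets the
    argument's own ``type`` callable raise on values it cannot convert.
    """
    # parser._actions is private API, so this may break in future versions
    actions = {action.dest: action for action in parser._actions}
    checked = {}
    for key, value in conf.items():
        if key not in actions:
            raise KeyError(f"unknown config option: {key!r}")
        convert = actions[key].type
        if callable(convert) and value is not None:
            # Convert via str() to mimic what argparse does on the command line
            if isinstance(value, list):
                value = [convert(str(v)) for v in value]
            else:
                value = convert(str(value))
        checked[key] = value
    return checked
```

With such a helper, a config file would be applied with parser.set_defaults(**checked_defaults(parser, json.load(f))) instead of passing the loaded dictionary directly.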
Adding config file support
Finally, to the point. First, of course, we need to allow config files on the command line:
import argparse

parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument('--conf', action='append')
Now for the trick: load the config files and use their content to change the argument defaults. I went with JSON, although other parsers can be used for other formats.
import json

args = parser.parse_args()
if args.conf is not None:
    for conf_fname in args.conf:
        with open(conf_fname, 'r') as f:
            parser.set_defaults(**json.load(f))
    # Reload arguments to override config file values with command line values
    args = parser.parse_args()
And that is essentially it.
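Putting the pieces together, a minimal end-to-end sketch might look like the following. The function name parse_with_config and the --lr and --epochs options are hypothetical, chosen only to have something to override.

```python
import argparse
import json


def parse_with_config(argv=None):
    """Parse argv, letting --conf files override defaults in the order given."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--conf', action='append')
    parser.add_argument('--lr', type=float, default=0.1)   # hypothetical option
    parser.add_argument('--epochs', type=int, default=10)  # hypothetical option

    # First pass: only used to find out which config files were requested
    args = parser.parse_args(argv)
    for conf_fname in args.conf or []:
        with open(conf_fname, 'r') as f:
            # Later files overwrite the defaults set by earlier ones
            parser.set_defaults(**json.load(f))

    # Second pass: explicit command line values win over config file values
    return parser.parse_args(argv)
```

Assuming two hypothetical files base.json and exp.json, calling parse_with_config(['--conf', 'base.json', '--conf', 'exp.json', '--lr', '0.5']) would take values from base.json, overwrite them with whatever exp.json defines, and finally use 0.5 for lr regardless of either file.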
It may also be useful to add a command line flag that dumps the configuration, to verify or save the settings.
import sys

# Add this next to the other arguments, before parse_args() is called
parser.add_argument('--conf_export', action='store_true')

# After the final parse_args() call
if args.conf_export:
    tmp_args = vars(args).copy()
    del tmp_args['conf_export']  # Do not dump value of conf_export flag
    del tmp_args['conf']  # Values already loaded
    print(json.dumps(tmp_args, indent=4, sort_keys=True))
    sys.exit(0)  # Optional, if the program should only print the config and exit
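The same export logic can be wrapped in a small function, which makes it easy to confirm that the dump contains only real settings and can therefore be fed back in through --conf. This is a sketch; the name dump_config and its skip parameter are hypothetical.

```python
import argparse
import json


def dump_config(args, skip=('conf', 'conf_export')):
    """Serialize parsed arguments to JSON, leaving out bookkeeping options."""
    settings = {k: v for k, v in vars(args).items() if k not in skip}
    return json.dumps(settings, indent=4, sort_keys=True)
```

Note that this only works as long as every argument value is JSON serializable.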
Dealing with optional arguments with defaults
Now to dealing with optional arguments with defaults, and alternatives to store_true options. The simplest solution is to use the nargs='?' option. Thus, if we set
parser.add_argument('--input', nargs='?', default='file')
we can just pass --input with no argument to set the value to None, which evaluates as false. The same trick can be used to bypass store_true and store_false arguments: either use an integer (zero/non-zero) value to denote booleans, or use nargs='?' to turn the option into a flag that sets the argument to None if set.
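A short sketch of both workarounds may make the behaviour clearer. The --verbose option is a hypothetical example of an integer standing in for a boolean, and the set_defaults call simulates a value coming from a config file.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--input', nargs='?', default='file')
# Hypothetical integer option used as a boolean: --verbose 0 / --verbose 1
parser.add_argument('--verbose', type=int, default=0)

print(parser.parse_args([]).input)                # file (option absent, default used)
print(parser.parse_args(['--input']).input)       # None (bare option, const is None)
print(parser.parse_args(['--input', 'x']).input)  # x

# Simulate a config file that switches verbose on ...
parser.set_defaults(verbose=1)
# ... which the command line can still switch off again
print(parser.parse_args(['--verbose', '0']).verbose)  # 0
```

The last two lines show why this helps: a store_true flag that a config file has set to true cannot be turned off from the command line, whereas an integer option can. On Python 3.9 and later, argparse.BooleanOptionalAction, which generates a --flag / --no-flag pair, is another option worth considering.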
Module specific options
Now the interesting part. I like to follow the recommendation in the pytorch_lightning documentation to define module-specific arguments inside the module where they are known. Here is a trick to do this without defining the parameters in multiple places, while minimizing the pollution of the self.__dict__ namespace.
First, add a static add_argparse_args function to your class, and add the options to the parent parser:
import argparse

class foo:
    @staticmethod
    def add_argparse_args(parent_parser=None):
        parser = argparse.ArgumentParser(
            parents=[parent_parser] if parent_parser is not None else [],
            add_help=False)
        parser.add_argument('--foo', type=int, default=0)
        return parser

parser = argparse.ArgumentParser(description=__doc__)
parser = foo.add_argparse_args(parser)
args = parser.parse_args()
Now, let us augment this a little by adding an __init__ function that processes the module arguments. This version ignores unknown named arguments, although it can easily be changed to raise an error or warning by checking the keys of kwargs against the destinations defined by the parser.
class foo:
    def __init__(self, **kwargs):
        parser = foo.add_argparse_args()
        # This only stores named arguments that are defined in add_argparse_args
        for action in parser._actions:
            if action.dest in kwargs:
                action.default = kwargs[action.dest]
        args = parser.parse_args([])
        self.__dict__.update(vars(args))

    @staticmethod
    def add_argparse_args(parent_parser=None):
        parser = argparse.ArgumentParser(
            parents=[parent_parser] if parent_parser is not None else [],
            add_help=False)
        parser.add_argument('--foo', type=int, default=0)
        return parser
For pytorch_lightning you will want to use self.hparams.update(vars(args)) instead of self.__dict__.update(vars(args)).
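To see how the pieces fit, here is a sketch of the class being used together with a top-level parser. The strict parameter and the --bar option are hypothetical additions for illustration: strict shows one way to raise on unknown arguments, and --bar stands in for an option that belongs to some other module. This variant uses set_defaults rather than assigning action.default directly, and, like the code above, relies on the private parser._actions attribute.

```python
import argparse


class foo:
    def __init__(self, strict=False, **kwargs):
        parser = foo.add_argparse_args()
        # Destinations this module knows about (parser._actions is private API)
        known = {action.dest for action in parser._actions}
        unknown = set(kwargs) - known
        if strict and unknown:
            raise TypeError(f"unknown arguments: {sorted(unknown)}")
        # Use the matching kwargs as defaults, then "parse" an empty command line
        parser.set_defaults(**{k: v for k, v in kwargs.items() if k in known})
        self.__dict__.update(vars(parser.parse_args([])))

    @staticmethod
    def add_argparse_args(parent_parser=None):
        parser = argparse.ArgumentParser(
            parents=[parent_parser] if parent_parser is not None else [],
            add_help=False)
        parser.add_argument('--foo', type=int, default=0)
        return parser


parser = foo.add_argparse_args(argparse.ArgumentParser())
parser.add_argument('--bar', default='x')  # hypothetical option of another module
args = parser.parse_args(['--foo', '3'])

module = foo(**vars(args))  # 'bar' is silently ignored
print(module.foo)           # 3
```

Passing strict=True in the last call would raise a TypeError instead, because bar is not defined in add_argparse_args.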
And voilà.