Handling argparse escaped character as option

2018-06-22 01:55:01

The argparse library handles escaped characters (like \t to tab and \n to newline) differently than I prefer. An answer to this question gives a solution, but I would like to make it less visible to the user.

Given the program:

#!/usr/bin/env python3
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-d', '--delimiter', default='\t')
args = parser.parse_args()
print(args)

You will receive this output:

bash$ parser.py -d \t
Namespace(delimiter='t')

bash$ parser.py -d \\t
Namespace(delimiter='\\t')

bash$ parser.py -d '\t'
Namespace(delimiter='\\t')

bash$ parser.py -d '\\t'
Namespace(delimiter='\\\\t')

bash$ parser.py -d "\t"
Namespace(delimiter='\\t')

bash$ parser.py -d "\\t"
Namespace(delimiter='\\t')

bash$ parser.py -d $'\t'
Namespace(delimiter='\t')

bash$ parser.py -d $'\\t'
Namespace(delimiter='\\t')

bash$ parser.py -d $"\t"
Namespace(delimiter='$\\t')

bash$ parser.py -d $"\\t"
Namespace(delimiter='$\\t')

I get the desired argument only with

parser.py -d $'\t'

but I would prefer the input to look something like

parser.py -d \t

or less preferably

parser.py -d '\t'
parser.py -d "\t"

If I want to change the behavior, is this something I can do using the argparse library? If not, is it possible for me to write the behavior on top of the existing argparse library? If not, is this just the way that bash passes arguments to argparse, and therefore out of my hands? If that is true, is this something that is usually documented to users, or is this behavior assumed to be normal?

The string that you see in the namespace is exactly the string that appears in sys.argv, which was created by bash and the interpreter. The parser does not process or tweak this string; it just sets the value in the namespace. You can verify this by printing sys.argv before parsing.
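To see this for yourself, here is a minimal sketch. The helper name show is made up for illustration, and explicit lists stand in for sys.argv[1:] so that it runs without a shell; in a real script you would print sys.argv itself.

```python
import argparse


def show(argv):
    """Print the raw argument list, then what argparse stored for -d."""
    parser = argparse.ArgumentParser()
    parser.add_argument('-d', '--delimiter', default='\t')
    print('argv:  ', argv)                  # what the shell delivered
    args = parser.parse_args(argv)
    print('parsed:', repr(args.delimiter))  # same string, untouched by argparse
    return args


# Explicit lists stand in for sys.argv[1:] (an assumption of this sketch).
show(['-d', '\\t'])  # what bash passes for:  -d '\t'   (backslash + t)
show(['-d', '\t'])   # what bash passes for:  -d $'\t'  (one tab character)
```

In both calls the parsed value is identical to the element in the argument list, so any difference between the cases comes from the shell's quoting rules, not from argparse.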

If it is clear to you what the user wants, then I'd suggest modifying args.delimiter after parsing. The primary purpose of the parser is to figure out what the user wants. You, as the programmer, can interpret and apply that information in any way.

Once you've worked out a satisfactory post-parsing function, you could implement it as a type for this argument (like what int() and float() do for numerical strings). But focus on the post-parsing processing.
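As a rough sketch of what such post-parsing could look like (the ESCAPES table and the name normalize_delimiter are hypothetical, and which spellings to accept is up to you):

```python
import argparse

# Hypothetical whitelist: spellings the user may type -> the character meant.
ESCAPES = {'\\t': '\t', '\\n': '\n', '\\\\': '\\'}


def normalize_delimiter(value):
    """Translate a known escape spelling; pass anything else through unchanged."""
    return ESCAPES.get(value, value)


parser = argparse.ArgumentParser()
parser.add_argument('-d', '--delimiter', default='\t')

# Explicit argv standing in for the command line:  parser.py -d '\t'
args = parser.parse_args(['-d', '\\t'])

# Post-parsing step: reinterpret the two-character string as a real tab.
args.delimiter = normalize_delimiter(args.delimiter)
```

Once the function behaves the way you want, the same callable can be passed as type=normalize_delimiter so the translation happens during parsing. A fixed table like this only accepts the spellings you list, which keeps surprises to a minimum.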

Assuming that the question was partially about how to carry out the post-processing explained by @hpaulj and since I couldn't see an immediate solution for Python 3 in the links above, here is a quick solution:

import codecs

def unescaped_str(arg_str):
    return codecs.decode(str(arg_str), 'unicode_escape')

then in the parser:

parser.add_argument('-d', '--delimiter', type=unescaped_str, default='\t')

This will make your less desirable cases work:

parser.py -d '\t'
parser.py -d "\t"

But not the desired unquoted \t, because bash strips the backslash before Python ever sees it. In any case, this solution can be dangerous as there is no check mechanism...
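One way to add a check, as a sketch only: wrap the decode in a type function that reports bad input through argparse. The name checked_delimiter and the single-character rule are my assumptions, not something the question requires.

```python
import argparse
import codecs


def checked_delimiter(arg_str):
    """Decode backslash escapes; reject bad escapes and multi-character results."""
    try:
        decoded = codecs.decode(arg_str, 'unicode_escape')
    except UnicodeDecodeError as exc:
        # e.g. a truncated \x escape or a trailing backslash
        raise argparse.ArgumentTypeError(
            f'invalid escape sequence in {arg_str!r}') from exc
    # Assumption of this sketch: a delimiter is exactly one character.
    if len(decoded) != 1:
        raise argparse.ArgumentTypeError(
            f'delimiter must be a single character, got {decoded!r}')
    return decoded


parser = argparse.ArgumentParser()
parser.add_argument('-d', '--delimiter', type=checked_delimiter, default='\t')

# Explicit argv standing in for the command line:  parser.py -d '\t'
args = parser.parse_args(['-d', '\\t'])
```

When the type function raises ArgumentTypeError, argparse prints the message as a normal usage error instead of a traceback. Note that unicode_escape is also known to mangle non-ASCII input, so this is only reasonable if delimiters are expected to be ASCII.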

Personally I would just expect that behavior: your shell interprets some items and passes either a literal tab, or a backslash and a letter t. I would not necessarily want the Python program to do a second level of interpretation (and there's nothing in argparse to do it).

That said, though, Python has built in interpreters for this; see this question and answers.
