
Python in Emacs: jumping to the definition of a global constant


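After creating a TAGS file for my project (find . -name "*.py" | xargs etags), I can use M-. to jump to the definition of a function. That's great. But if I want the definition of a global constant, say x = 3, Emacs doesn't know where to find it.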

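Is there any way to tell Emacs where constants are defined, not just functions? I don't need this for anything defined inside a function (or a for loop, or the like), only for globals.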

More details

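An earlier incarnation of this question used the term "top-level" rather than "global", but with @Thomas's help I realized that was imprecise. By a global definition I mean anything that the module defines. Thus, in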

import m

if m.foo:
  def f():
    x = 3
    return x
  y,z = 1,2
else:
  def f():
    x = 4
    return x
  y,z = 2,3
del(z)

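the module defines f and y, even though the sites of those definitions are indented to the right. x is a local variable, and the definition of z is deleted before the end of the module.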

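I believe a sufficient rule for capturing all global assignments is to ignore any that appear inside a def (noting that the def keyword itself may be indented to any level), and otherwise to collect every symbol to the left of an = (noting that there may be more than one, because Python supports tuple assignment).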

Solution

Etags does not seem to be able to generate this kind of information for Python files, which you can easily verify by running it on a simple test file:

x = 3

def fun():
    pass

Running etags test.py produces a TAGS file with the following content (control characters shown as ^L and ^?):

^L
/tmp/test.py,13
def fun(^?3,7

As you can see, x is completely missing from this file, so Emacs has no chance of finding it.

The etags man page tells us that there is an option --globals:

   --globals
          Create tag entries for global variables in  Perl  and  Makefile.
          This is the default in C and derived languages.

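However, this seems to be one of those sad cases where the documentation is out of sync with the implementation, because this option does not seem to exist. (etags -h doesn't list it either, only --no-globals, probably because --globals is the default, as stated above.)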

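However, even if --globals is the default, the documentation snippet says that it only applies to Perl, Makefiles, C, and derived languages. We can check whether that is the case by creating another simple test file, this time for C: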

int x = 3;

void fun() {
}

Indeed, running etags test.c produces the following TAGS file (control characters shown as ^L and ^?):

^L
/tmp/test.c,26
int x ^?1,0
void fun(^?3,12

You can see that x is correctly identified for C. So it seems that etags simply does not support global variables for Python.
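For readers who want to inspect such files programmatically, here is a minimal Python sketch of a TAGS reader. It assumes the usual on-disk layout as I understand it: each section starts with a form feed (0x0C) followed by a `filename,size` header, and each entry has the form `text<0x7F>line,offset`, optionally with an explicit tag name terminated by 0x01. The function name `parse_tags` is hypothetical and not part of etags.

```python
def parse_tags(data: bytes):
    """Return a list of (filename, declared_size, entries) per TAGS section.

    Each entry is a tuple (tag_name, line_number, byte_offset).
    """
    sections = []
    # Sections are introduced by a form feed followed by a newline.
    for raw in data.split(b"\x0c\n")[1:]:
        header, _, body = raw.partition(b"\n")
        filename, _, size = header.rpartition(b",")
        if size == b"include":
            # A reference to another tags file, not a source file section.
            continue
        entries = []
        for line in body.splitlines():
            if b"\x7f" not in line:
                continue
            text, _, rest = line.partition(b"\x7f")
            name = text
            if b"\x01" in rest:
                # An explicit tag name precedes the position information.
                name, _, rest = rest.partition(b"\x01")
            lineno, _, offset = rest.partition(b",")
            entries.append((
                name.decode(),
                int(lineno) if lineno else None,
                int(offset) if offset else None,
            ))
        sections.append((filename.decode(), int(size), entries))
    return sections


if __name__ == "__main__":
    # The C example from above, with the control characters written out.
    sample = b"\x0c\n/tmp/test.c,26\nint x \x7f1,0\nvoid fun(\x7f3,12\n"
    for filename, size, entries in parse_tags(sample):
        print(filename, size, entries)
```

The declared size in the header (26 here) is the byte length of the entries that follow it, which is why both scripts in the answers have to count bytes before writing a section.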

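However, because indentation is significant in Python, it is not too hard to identify global variable definitions in source files: you can basically grep for all lines that don't start with whitespace but contain a = symbol (of course, there are exceptions).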
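The same heuristic can be sketched in a few lines of Python. This is only an illustration of the idea, not the script from this answer; the pattern and the function name `toplevel_assignments` are my own assumptions.

```python
import re

# An identifier at column 0, optional spaces, then a single '=' (not '==').
TOPLEVEL_ASSIGN = re.compile(r"^([A-Za-z_]\w*)\s*=(?!=)")


def toplevel_assignments(source: str):
    """Return (line_number, name) for unindented 'name = ...' lines."""
    found = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = TOPLEVEL_ASSIGN.match(line)
        if match:
            found.append((lineno, match.group(1)))
    return found


if __name__ == "__main__":
    example = "x = 3\n\ndef fun():\n    y = 4\n\nif x == 3:\n    z = 1\n"
    print(toplevel_assignments(example))
```

On this example only x is reported. The indented z inside the if block is missed, and so would be a tuple assignment such as y, z = 1, 2, which are the kinds of exceptions mentioned above.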

So I wrote the following script to do exactly that. You can use it as a drop-in replacement for etags, since it calls etags internally:

#!/bin/bash

# make sure that some input files are provided, or else there's
# nothing to parse
if [ $# -eq 0 ]; then
    # the following message is just a copy of etags' error message
    echo "$(basename ${0}): no input files specified."
    echo " Try '$(basename ${0}) --help' for a complete list of options."
    exit 1
fi

# extract all non-flag parameters as the actual filenames to consider
TAGS2="TAGS2"
argflags=($(etags -h | grep '^-' | sed 's/,.*$//' | grep ' ' | awk '{print $1}'))
files=()
skip=0
for arg in "${@}"; do
    # the variable 'skip' signals arguments that should not be
    # considered as filenames, even though they don't start with a
    # hyphen
    if [ ${skip} -eq 0 ]; then
        # arguments that start with a hyphen are considered flags and
        # thus not added to the 'files' array
        if [ "${arg:0:1}" = '-' ]; then
            if [ "${arg:0:9}" = "--output=" ]; then
                TAGS2="${arg:9}2"
            else
                # however, since some flags take a parameter, we also
                # check whether we should skip the next command line
                # argument: the arguments for which this is the case are
                # contained in 'argflags'
                for argflag in ${argflags[@]}; do
                    if [ "${argflag}" = "${arg}" ]; then
                        # we need to skip the next 'arg', but in case the
                        # current flag is '-o' we should still look at the
                        # next 'arg' so as to update the path to the
                        # output file of our own parsing below
                        if [ "${arg}" = "-o" ]; then
                            # the next 'arg' will be etags' output file
                            skip=2
                        else
                            skip=1
                        fi
                        break
                    fi
                done
            fi
        else
            files+=("${arg}")
        fi
    else
        # the current 'arg' is not an input file, but it may be the
        # path to the etags output file
        if [ "${skip}" = 2 ]; then
            TAGS2="${arg}2"
        fi
        skip=0
    fi
done

# create a separate TAGS file specifically for global variables
for file in "${files[@]}"; do
    # find all lines that are not indented, are not comments or
    # decorators, and contain a '=' character, then turn them into
    # TAGS format, except that the filename is prepended
    grep -P -Hbn '^[^[# \t].*=' "${file}" | sed -E 's/([0-9]+):([0-9]+):([^= \t]+)\s*=.*$/\3\x7f\1,\2/'
done |\
# count the bytes of each entry - this is needed for the TAGS
# specification
while read line; do
    echo "$(echo $line | sed 's/^.*://' | wc -c):$line"
done |\
# turn the information above into the correct TAGS file format
awk -F: '
    BEGIN { filename=""; numlines=0 }
    {
        if (filename != $2) {
            if (numlines > 0) {
                print "\x0c\n" filename "," bytes+1

                for (i in lines) {
                    print lines[i]
                    delete lines[i]
                }
            }

            filename=$2
            numlines=0
            bytes=0
        }

        lines[numlines++] = $3;
        bytes += $1;
    }
    END {
        if (numlines > 0) {
            print "\x0c\n" filename "," bytes+1

            for (i in lines)
                print lines[i]
        }
    }' > "${TAGS2}"

# now run the actual etags, instructing it to include the global
# variables information
if ! etags -i "${TAGS2}" "${@}"; then
    # if etags failed to create the TAGS file, also delete the TAGS2
    # file
    /bin/rm -f "${TAGS2}"
fi

Store this script somewhere on your $PATH under a convenient name (I suggest something like etags+), then call it with the same arguments you would pass to etags:

etags+

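Besides creating the TAGS file, the script also creates a TAGS2 file for all global variable definitions, and adds a line to the original TAGS file that references the latter.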
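To check that the reference really ended up in the main file, a small Python sketch like the following can list the included tag files. It assumes, as far as I know the format, that etags records an -i argument as a section whose header reads `name,include`. The helper name `included_tag_files` is hypothetical.

```python
import re

# A form feed, a newline, then "<name>,include" on a line of its own.
INCLUDE_ENTRY = re.compile(rb"\x0c\n([^\n]+),include\n")


def included_tag_files(data: bytes):
    """Return the names of tag files referenced from raw TAGS bytes."""
    return [name.decode() for name in INCLUDE_ENTRY.findall(data)]


if __name__ == "__main__":
    sample = (
        b"\x0c\n/tmp/test.py,13\ndef fun(\x7f3,7\n"
        b"\x0c\nTAGS2,include\n"
    )
    print(included_tag_files(sample))
```

For a real project you would read the bytes with open("TAGS", "rb").read() instead of using the inline sample.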

From Emacs' perspective, there is no difference in usage.


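The other answer only considers lines without indentation as containing global variable declarations. While this effectively excludes the bodies of function and class definitions, it misses global variables defined inside if statements. Such declarations are not uncommon, for example for constants that differ depending on the operating system in use.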

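As pointed out in the comments under the question, any static analysis is necessarily imperfect, because Python's dynamic nature makes it impossible to decide with complete accuracy which variables are globally defined unless the program is actually executed.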
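A small, made-up example illustrates the gap. The module source below creates two globals through globals(), which a purely syntactic scan for assignments cannot see. The names `assigned_names` and `runtime_names` are hypothetical helpers written for this illustration.

```python
import ast

SOURCE = (
    'for name, value in [("alpha", 1), ("beta", 2)]:\n'
    "    globals()[name] = value\n"
    "gamma = 3\n"
)


def assigned_names(source: str):
    """Names that appear as plain assignment targets in the syntax tree."""
    tree = ast.parse(source)
    return sorted({
        target.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Assign)
        for target in node.targets
        if isinstance(target, ast.Name)
    })


def runtime_names(source: str):
    """Names that exist in the module namespace after executing the code."""
    namespace = {}
    exec(source, namespace)
    return sorted(k for k in namespace if not k.startswith("__"))


if __name__ == "__main__":
    print("static :", assigned_names(SOURCE))
    print("runtime:", runtime_names(SOURCE))
```

The static scan only reports gamma, while executing the code also yields alpha and beta (and the loop variables name and value, which are module globals as well).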

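Therefore, the following is also just an approximation. However, it does take into account global variable definitions inside if statements, as described above. Since this is best done by actually analyzing the parse tree of the source file, a bash script is no longer the appropriate choice. Conveniently, though, Python itself provides easy access to the parse tree through the ast package, which is used here.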

from argparse import ArgumentParser,SUPPRESS
import ast
from collections import Counter
from re import match as re_startswith
import os
import subprocess
import sys

# extract variable information from assign statements
def process_assign(target,results):
    if isinstance(target,ast.Name):
        results.append((target.lineno,target.col_offset,target.id))
    elif isinstance(target,ast.Tuple):
        for child in ast.iter_child_nodes(target):
            process_assign(child,results)

# extract variable information from delete statements
def process_delete(target, results):
    if isinstance(target, ast.Name):
        results[:] = filter(lambda t: t[2] != target.id, results)
    elif isinstance(target,ast.Tuple):
        for child in ast.iter_child_nodes(target):
            process_delete(child,results)

# recursively walk the parse tree of the source file
def process_node(node,results):
    if isinstance(node,ast.Assign):
        for target in node.targets:
            process_assign(target,results)
    elif isinstance(node,ast.Delete):
        for target in node.targets:
            process_delete(target,results)
    elif type(node) not in [ast.FunctionDef,ast.ClassDef]:
        for child in ast.iter_child_nodes(node):
            process_node(child,results)

def get_arg_parser():
    # create the parser to configure
    parser = ArgumentParser(usage=SUPPRESS,add_help=False)

    # run etags to find out about the supported command line parameters
    dashlines = list(filter(lambda line: re_startswith('\\s*-',line),subprocess.check_output(['etags','-h'],encoding='utf-8').split('\n')))

    # ignore lines that start with a dash but don't have the right
    # indentation
    most_common_indent = max([(v,k) for k,v in
                              Counter([line.index('-') for line in dashlines]).items()])[1]
    arglines = filter(lambda line: line.index('-') == most_common_indent,dashlines)

    for argline in arglines:
        # the various 'argline' entries contain the command line
        # arguments for etags,sometimes more than one separated by
        # commas.
        for arg in argline.split(','):
            if 'or' in arg:
                arg = arg[:arg.index('or')]
            if ' ' in arg or '=' in arg:
                arg = arg[:min(arg.index(' ') if ' ' in arg else len(arg),arg.index('=') if '=' in arg else len(arg))]
                action='store'
            else:
                action='store_true'
            arg = arg.strip()
            if arg and not (arg == '-h' or arg == '--help'):
                parser.add_argument(arg,action=action)

    # we know we need files to run on
    parser.add_argument('files',nargs='*',metavar='file')

    # the parser is configured now to accept all of etags' arguments
    return parser


if __name__ == '__main__':
    # construct a parser for the command line arguments,unless
    # -h/-help/--help is given in which case we just print the help
    # screen
    etags_args = sys.argv[1:]
    if '-h' in etags_args or '-help' in etags_args or '--help' in etags_args:
        unknown_args = True
    else:
        argparser = get_arg_parser()
        known_ns,unknown_args = argparser.parse_known_args()

    # if something's wrong with the command line arguments,print
    # etags' help screen and exit
    if unknown_args:
        subprocess.run(['etags', '-h'], encoding='utf-8')
        sys.exit(1)

    # we base the output filename on the TAGS file name.  Other than
    # that,we only care about the actual filenames to parse,and all
    # other command line arguments are simply passed to etags later on
    tags_file = 'TAGS2' if getattr(known_ns, 'o', None) is None else known_ns.o + '2'
    filenames = known_ns.files

    if filenames:
        # TAGS file sections,one per source file
        sections = []

        # process all files to populate the 'sections' list
        for filename in filenames:
            # read source file
            offsets, lines = [0], []
            with open(filename,'r') as f:
                for line in f.readlines():
                    offsets.append(offsets[-1] + len(bytes(line,'utf-8')))
                    lines.append(line)

            offsets = offsets[:-1]

            # parse source file
            source = ''.join(lines)
            root_node = ast.parse(source,filename)

            # extract global variable definitions
            vardefs = []
            process_node(root_node,vardefs)

            # create TAGS file section
            sections.append("")
            for lineno,column,varname in vardefs:
                line = lines[lineno-1]
                offset = offsets[lineno-1]
                end = line.index('=') if '=' in line else -1
                sections[-1] += f"{line[:end]}\x7f{varname}\x01{lineno},{offset + column - 1}\n"

        # write TAGS file
        with open(tags_file,'w') as f:
            for filename,section in zip(filenames,sections):
                if section:
                    f.write("\x0c\n")
                    f.write(filename)
                    f.write(",")
                    f.write(str(len(bytes(section,'utf-8'))))
                    f.write("\n")
                    f.write(section)
                    f.write("\n")

        # make sure etags includes the newly created file
        etags_args += ['-i',tags_file]

    # now run the actual etags to take care of all other definitions
    try:
        cp = subprocess.run(['etags'] + etags_args,encoding='utf-8')
        status = cp.returncode
    except:
        status = 1

    # if etags did not finish successfully,remove the tags_file
    if status != 0:
        try:
            os.remove(tags_file)
        except FileNotFoundError:
            # nothing to be removed
            pass

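Like the other answer, this script is intended as a drop-in replacement for the standard etags, since it calls the latter internally. It therefore also accepts all of etags' command line arguments (but currently does not respect -a).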
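To see what the tree walk collects without generating any TAGS files, here is a condensed sketch of the same idea applied to the snippet from the question. It is a simplified illustration rather than the script above: the function name `global_assignments` is hypothetical, and like the script it only looks at plain assignment statements.

```python
import ast


def global_assignments(source: str):
    """Return (line_number, name) pairs for module-level assignments."""
    results = []

    def add_targets(target):
        if isinstance(target, ast.Name):
            results.append((target.lineno, target.id))
        elif isinstance(target, (ast.Tuple, ast.List)):
            for element in target.elts:
                add_targets(element)

    def drop_targets(target):
        # A 'del name' removes every earlier definition of that name.
        if isinstance(target, ast.Name):
            results[:] = [r for r in results if r[1] != target.id]
        elif isinstance(target, (ast.Tuple, ast.List)):
            for element in target.elts:
                drop_targets(element)

    def visit(node):
        if isinstance(node, ast.Assign):
            for target in node.targets:
                add_targets(target)
        elif isinstance(node, ast.Delete):
            for target in node.targets:
                drop_targets(target)
        elif not isinstance(
            node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
        ):
            # Descend into if/else, loops, etc., but not into function
            # or class bodies.
            for child in ast.iter_child_nodes(node):
                visit(child)

    visit(ast.parse(source))
    return results


EXAMPLE = "\n".join([
    "import m",
    "",
    "if m.foo:",
    "  def f():",
    "    x = 3",
    "    return x",
    "  y,z = 1,2",
    "else:",
    "  def f():",
    "    x = 4",
    "    return x",
    "  y,z = 2,3",
    "del(z)",
])

if __name__ == "__main__":
    print(global_assignments(EXAMPLE))
```

Both definitions of y are found (lines 7 and 12), x is skipped because it lives inside a function body, and z is dropped again because of the del at the end. The function f itself is not reported here, since definitions are left to etags proper.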

It is advisable to add an alias to your shell's init file, for example by adding the following line to ~/.bashrc:

alias etags+="python3 -u /path/to/script.py"

where /path/to/script.py is the path to the file in which the above code was saved. With such an alias in place, you can simply call

etags+ /path/to/file
