How to resolve coreference with AllenNLP and coref-spanbert-large without Internet access?
I want to resolve coreference without Internet access, using AllenNLP and the coref-spanbert-large model. I tried to do it as described here: https://demo.allennlp.org/coreference-resolution
My code:
from allennlp.predictors.predictor import Predictor
import allennlp_models.tagging
predictor = Predictor.from_path(r"C:\Users\aap\Desktop\coref-spanbert-large-2021.03.10.tar.gz")
example = 'Paul Allen was born on January 21, 1953, in Seattle, Washington, to Kenneth Sam Allen and Edna Faye Allen. Allen attended Lakeside School, a private school in Seattle, where he befriended Bill Gates, two years younger, with whom he shared an enthusiasm for computers.'
pred = predictor.predict(document=example)
coref_res = predictor.coref_resolved(example)
print(pred)
print(coref_res)
The code works fine when I have Internet access. But when I have no Internet access, I get the following error:
Traceback (most recent call last):
  File "C:/Users/aap/Desktop/CoreNLP/Coref_AllenNLP.py", line 14, in <module>
    predictor = Predictor.from_path(r"C:\Users\aap\Desktop\coref-spanbert-large-2021.03.10.tar.gz")
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\predictors\predictor.py", line 361, in from_path
    load_archive(archive_path, cuda_device=cuda_device, overrides=overrides),
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\models\archival.py", line 206, in load_archive
    config.duplicate(), serialization_dir
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\models\archival.py", line 232, in _load_dataset_readers
    dataset_reader_params, serialization_dir=serialization_dir
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\common\from_params.py", line 604, in from_params
    **extras,
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\common\from_params.py", line 632, in from_params
    kwargs = create_kwargs(constructor_to_inspect, cls, params, **extras)
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\common\from_params.py", line 200, in create_kwargs
    cls.__name__, param_name, annotation, param.default, **extras
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\common\from_params.py", line 307, in pop_and_construct_arg
    return construct_arg(class_name, name, popped_params, default,
  line 391, in construct_arg
    **extras,
  line 341, in construct_arg
    return annotation.from_params(params=popped_params, **subextras)
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\common\from_params.py", line 634, in from_params
    return constructor_to_call(**kwargs)  # type: ignore
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\data\token_indexers\pretrained_transformer_mismatched_indexer.py", line 63, in __init__
    **kwargs,
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\data\token_indexers\pretrained_transformer_indexer.py", line 58, in __init__
    model_name, tokenizer_kwargs=tokenizer_kwargs
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\data\tokenizers\pretrained_transformer_tokenizer.py", line 71
    add_special_tokens=False, **tokenizer_kwargs
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\allennlp\common\cached_transformers.py", line 110, in get_tokenizer
    **kwargs,
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 362, in from_pretrained
    config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\transformers\models\auto\configuration_auto.py", line 368, in from_pretrained
    config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\transformers\configuration_utils.py", line 424, in get_config_dict
    use_auth_token=use_auth_token,
  File "C:\Users\aap\Desktop\CoreNLP\corenlp\lib\site-packages\transformers\file_utils.py", line 1087, in cached_path
    local_files_only=local_files_only,
  line 1268, in get_from_cache
    "Connection error, and we cannot find the requested files in the cached path."
ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.
Process finished with exit code 1
Please tell me: what do I need in order to run my code without Internet access?
Solution
You will need local copies of the transformer model's configuration file and vocabulary, so that the tokenizer and token indexer don't need to download them:
from transformers import AutoConfig, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(transformer_model_name)
config = AutoConfig.from_pretrained(transformer_model_name)

tokenizer.save_pretrained(local_config_path)
config.to_json_file(local_config_path + "/config.json")
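After saving, it is worth confirming that the directory actually contains everything the tokenizer will look for offline. A minimal sanity check, assuming the standard Hugging Face filenames that `save_pretrained()` and `to_json_file()` typically produce for a BERT-style tokenizer (exact names can vary by tokenizer class):

```python
from pathlib import Path

def check_local_files(local_config_path):
    """Return the list of expected Hugging Face files missing from the directory."""
    # Typical artifacts for a BERT-style tokenizer plus the model config;
    # other tokenizer classes may save different files.
    expected = [
        "config.json",
        "vocab.txt",
        "tokenizer_config.json",
        "special_tokens_map.json",
    ]
    root = Path(local_config_path)
    return [name for name in expected if not (root / name).exists()]

# "local_spanbert" is a hypothetical directory name used for illustration.
missing = check_local_files("local_spanbert")
print("missing files:", missing)
```

If the check reports missing files while you are still online, re-run the save step before going offline.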
Then you need to override the transformer model name in the archive's configuration so that it points to the local directory (local_config_path) where you saved these files:
predictor = Predictor.from_path(
    r"C:\Users\aap\Desktop\coref-spanbert-large-2021.03.10.tar.gz",
    overrides={
        "dataset_reader.token_indexers.tokens.model_name": local_config_path,
        "validation_dataset_reader.token_indexers.tokens.model_name": local_config_path,
        "model.text_field_embedder.tokens.model_name": local_config_path,
    },
)
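The dotted keys in `overrides` address nested entries inside the archive's config: AllenNLP merges each override into the configuration by walking the path segment by segment. A minimal pure-Python sketch of that idea (illustration only, with made-up config values; the real logic lives in AllenNLP's `Params` handling):

```python
def apply_overrides(config, overrides):
    """Merge {"a.b.c": value} entries into a nested dict by dotted path."""
    for dotted_key, value in overrides.items():
        keys = dotted_key.split(".")
        node = config
        for key in keys[:-1]:
            # Descend, creating intermediate dicts if they don't exist.
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return config

# Made-up fragment of an archived config, for illustration.
config = {
    "dataset_reader": {
        "token_indexers": {"tokens": {"model_name": "some/original-model"}}
    }
}
apply_overrides(
    config,
    {"dataset_reader.token_indexers.tokens.model_name": "./local_spanbert"},
)
print(config["dataset_reader"]["token_indexers"]["tokens"]["model_name"])  # → ./local_spanbert
```

This is why the answer overrides three separate keys: the training reader, the validation reader, and the model's embedder each store their own copy of the transformer model name.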