
Status "Preparing" for more than 2 hours with a 350 MB file

How to resolve: status "Preparing" for more than 2 hours with a 350 MB file

I submitted an AutoML run on remote compute (Standard_D12_v2, 4-node cluster, 28 GB RAM and 4 cores per node).

My input file is about 350 MB.

The status stayed at "Preparing" for more than 2 hours, and then the run failed with:

User error: Run timed out. No model completed training in the specified time. Possible solutions: 
1) Please check if there are enough compute resources to run the experiment. 
2) Increase experiment timeout when creating a run. 
3) Subsample your dataset to decrease featurization/training time. 
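
As a rough illustration of suggestion 3 in the message above, the sketch below keeps a random fraction of the rows of a CSV file using only the Python standard library. The function name `subsample_csv`, the 10% fraction, and the `tripDistance` column in the demo data are assumptions made for illustration; they do not come from the run itself. The Azure ML SDK's TabularDataset may also offer its own sampling (for example `take_sample`), which would avoid rewriting the file.

```python
import csv
import io
import random


def subsample_csv(src, dst, fraction, seed=223):
    """Copy the header and roughly `fraction` of the data rows from src to dst.

    src and dst are open text file objects; returns the number of rows kept.
    (Hypothetical helper, not part of any SDK.)
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # seeded so the sample is reproducible
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader, None)
    if header is None:
        return 0  # empty input: nothing to copy
    writer.writerow(header)
    kept = 0
    for row in reader:
        # Keep each row independently with probability `fraction`.
        if rng.random() < fraction:
            writer.writerow(row)
            kept += 1
    return kept


# Tiny in-memory demonstration with made-up rows.
demo_src = io.StringIO(
    "totalAmount,tripDistance\n" + "".join(f"{i},{i * 2}\n" for i in range(1000))
)
demo_dst = io.StringIO()
kept_rows = subsample_csv(demo_src, demo_dst, fraction=0.1)
```

For a real file, the same function could be called with files opened via `open(path, newline="")`, and the smaller CSV uploaded in place of the 350 MB one.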

Below is my Python notebook code. Any help is appreciated.

import logging  # needed for logging.INFO in automl_settings

import azureml.core
from azureml.core.experiment import Experiment
from azureml.core.workspace import Workspace
from azureml.core.dataset import Dataset
from azureml.core.compute import ComputeTarget
from azureml.train.automl import AutoMLConfig



# Load the workspace from the local config file and create the experiment.
ws = Workspace.from_config()
experiment = Experiment(ws, 'nyc-taxi')

# Attach to the existing compute cluster.
cpu_cluster_name = "low-cluster"
compute_target = ComputeTarget(workspace=ws, name=cpu_cluster_name)

# Read the CSV from blob storage and split it 80/20.
data = "https://betaml4543906917.blob.core.windows.net/betadata/2015_08.csv"
dataset = Dataset.Tabular.from_delimited_files(data)
training_data, validation_data = dataset.random_split(percentage=0.8, seed=223)
label_column_name = 'totalAmount'



automl_settings = {
    "n_cross_validations": 3,
    "primary_metric": 'normalized_root_mean_squared_error',
    "enable_early_stopping": True,
    # This is a limit for testing purposes; please increase it as per cluster size.
    "max_concurrent_iterations": 2,
    # This is a time limit for testing purposes; remove it for real use cases,
    # as it drastically limits the ability to find the best model possible.
    "experiment_timeout_hours": 2,
    "verbosity": logging.INFO,
}

automl_config = AutoMLConfig(
    task='regression',
    debug_log='automl_errors.log',
    compute_target=compute_target,
    training_data=training_data,
    label_column_name=label_column_name,
    **automl_settings,
)




remote_run = experiment.submit(automl_config, show_output=False)
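
Because the run is submitted with show_output set to False, nothing is printed while it sits in "Preparing". A minimal sketch of a polling helper that would make a long wait visible is shown here. The helper name `wait_while_preparing` and the set of pre-run state names are assumptions, and the demo uses a fake status source rather than a real run.

```python
import time


def wait_while_preparing(get_status, timeout_s, poll_s=30.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it leaves the pre-run states or timeout_s elapses.

    Returns the last status seen. (Hypothetical helper, not part of any SDK.)
    """
    # Assumed names of states that precede actual training.
    pending = {"NotStarted", "Starting", "Queued", "Preparing", "Provisioning"}
    deadline = clock() + timeout_s
    status = get_status()
    while status in pending and clock() < deadline:
        sleep(poll_s)
        status = get_status()
    return status


# Demonstration with a fake status sequence instead of a real run.
_fake_statuses = iter(["Preparing", "Preparing", "Running"])
demo_status = wait_while_preparing(lambda: next(_fake_statuses), timeout_s=5, poll_s=0)
```

In the notebook above, the callable would presumably be `remote_run.get_status`, with a timeout well below the 2-hour experiment limit so that a stuck run is noticed early.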
