如何解决使用CloudFormation模板运行搜寻器
此CloudFormation模板可以按预期工作,并创建本文所需的所有资源:
但是WorkflowStartTrigger
资源实际上并未运行搜寻器。如何使用CloudFormation模板运行搜寻器?
Resources:
MyRole:
Type: AWS::IAM::Role
Properties:
AssumeRolePolicyDocument:
Version: "2012-10-17"
Statement:
-
Effect: "Allow"
Principal:
Service:
- "glue.amazonaws.com"
Action:
- "sts:AssumeRole"
Path: "/"
Policies:
-
PolicyName: "root"
PolicyDocument:
Version: "2012-10-17"
Statement:
-
Effect: "Allow"
Action: "*"
Resource: "*"
MyDatabase:
Type: AWS::Glue::Database
Properties:
CatalogId: !Ref AWS::AccountId
DatabaseInput:
Name: "dbcrawler123"
Description: "TestDatabaseDescription"
LocationUri: "TestLocationUri"
Parameters:
key1 : "value1"
key2 : "value2"
MyCrawler2:
Type: AWS::Glue::Crawler
Properties:
Description: example classifier
Name: "testcrawler123"
Role: !GetAtt MyRole.Arn
DatabaseName: !Ref MyDatabase
Targets:
S3Targets:
- Path: 's3://nytaxi162/'
SchemaChangePolicy:
UpdateBehavior: "UPDATE_IN_DATABASE"
DeleteBehavior: "LOG"
TablePrefix: test-
Configuration: "{\"Version\":1.0,\"CrawlerOutput\":{\"Partitions\":{\"AddOrUpdateBehavior\":\"InheritFromTable\"},\"Tables\":{\"AddOrUpdateBehavior\":\"MergeNewColumns\"}}}"
WorkflowStartTrigger:
Type: AWS::Glue::Trigger
Properties:
Description: Trigger for starting the Crawler
Name: StartTrigger
Type: ON_DEMAND
Actions:
- CrawlerName: "testcrawler123"
解决方法
您应该能够通过创建附加到lambda的自定义资源来做到这一点,从而使lambda实际上执行启动搜寻器的操作。您甚至应该可以让它等待搜寻器完成其执行
,CloudFormation 不直接运行爬虫,它只是创建它们。 但是你可以在定义触发器的同时创建一个时间表来运行一个爬虫:
ScheduledJobTrigger:
Type: 'AWS::Glue::Trigger'
Properties:
Type: SCHEDULED
StartOnCreation: true
Description: DESCRIPTION_SCHEDULED
Schedule: cron(5 * * * ? *)
Actions:
- CrawlerName: "testcrawler123"
Name: ETLGlueTrigger
如果需要在 CloudFormation 堆栈创建过程中运行爬虫,可以使用 Lambda。
版权声明:本文内容由互联网用户自发贡献,该文观点与技术仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 dio@foxmail.com 举报,一经查实,本站将立刻删除。