
Create a Self-Service Sagemaker with AWS Service Catalog

As part of the Data lake I’m working on, I set up Sagemaker instances for Data Analysts to run analytics jobs. These instances are needed on demand, that is, only while Analysts are crunching or exploring data. One solution could be to create the instances upfront, at the very start of the job, and clean them up once the job has completed. Yet instances incur costs even when stopped, through their EBS volumes. Also, OS patches aren’t applied in a stopped state.

We want the Analysts to be able to create those instances themselves, as big as they need them, when they need them. Not by clicking around freely, of course. The Service Catalog in the AWS Console provides exactly this capability: it lets the users, the Data Analysts, interact with the resources they need in a controlled way. I gave the Analysts full guidance by describing the infrastructure and properties that Sagemaker requires. By storing these settings in a template, I allowed Analysts in a specific role to start an instance with a single click. This is where AWS Service Catalog comes in: it runs the stored CloudFormation template, shipped as a CloudFormation Product.

Portfolio, Product & Portfolio Product Association

The following snippets describe how I made the Sagemaker part of the Data lake work. The CloudFormation template for the product is stored in a separate S3 bucket. Once these resources are created, a user will be able to launch the stack.

DatalakePortfolio:
  Type: "AWS::ServiceCatalog::Portfolio"
  Properties:
    Description: "A portfolio of self-service Datalake."
    DisplayName: "Datalake Portfolio"
    ProviderName: "Datalake"

SagemakerProduct:
  Type: AWS::ServiceCatalog::CloudFormationProduct
  Properties:
    Description: "Sagemaker Product"
    Distributor: "Datalake"
    Name: "Sagemaker"
    Owner: "binx"
    ProvisioningArtifactParameters:
    - Description: "Initial version"
      DisableTemplateValidation: False
      Info:
        LoadTemplateFromURL: "https://s3.amazonaws.com/<<BucketName>>/products/sagemaker-instance.yml"
      Name: "v1"
    SupportEmail: "datalake@binx.nl"
    SupportUrl: "https://confluence.binx.nl/display/DLAKE/Service+Catalog"

SagemakerPortfolioProductAssociation:
  Type: "AWS::ServiceCatalog::PortfolioProductAssociation"
  Properties:
    PortfolioId:
      Ref: DatalakePortfolio
    ProductId:
      Ref: SagemakerProduct
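
The product template that LoadTemplateFromURL points to is not shown in this post. As a rough idea of what products/sagemaker-instance.yml could contain, here is a minimal sketch. The parameter names, the list of instance types, and the assumption that the role sagemaker-notebook-iam-role (the name used in the launch role policy further down) already exists are mine, not taken from the actual template.

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Description: "Sagemaker notebook instance (illustrative sketch)"

Parameters:
  # Hypothetical parameters: these are what the Analyst fills in at launch time.
  InstanceType:
    Type: String
    Default: "ml.t3.medium"
    AllowedValues:
      - "ml.t3.medium"
      - "ml.m5.xlarge"
      - "ml.m5.4xlarge"
    Description: "Size of the notebook instance."
  VolumeSizeInGB:
    Type: Number
    Default: 20
    MinValue: 5
    Description: "Size of the attached EBS volume in GB."

Resources:
  NotebookInstance:
    Type: "AWS::SageMaker::NotebookInstance"
    Properties:
      InstanceType:
        Ref: InstanceType
      VolumeSizeInGB:
        Ref: VolumeSizeInGB
      # Assumption: this role already exists in the account.
      RoleArn:
        Fn::Sub: "arn:aws:iam::${AWS::AccountId}:role/sagemaker-notebook-iam-role"

Outputs:
  NotebookInstanceName:
    Value:
      Fn::GetAtt: [ NotebookInstance, NotebookInstanceName ]
```

Restricting InstanceType with AllowedValues is one way to let Analysts pick a size without opening up every instance type.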

Portfolio Principal Association

Attach the Portfolio to a previously created Principal. This can be an IAM User, Group or Role. In our case it is bound to the SAML-authenticated IAM role that a Data Analyst assumes.

DatalakePortfolioPrincipalPUAssociation:
  Type: "AWS::ServiceCatalog::PortfolioPrincipalAssociation"
  Properties:
    PortfolioId:
      Ref: DatalakePortfolio
    PrincipalARN:
      Ref: PrincipalARN  # assumes a template parameter named PrincipalARN
    PrincipalType: "IAM"
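
The PrincipalARN value has to be supplied when the stack is deployed. A minimal sketch, assuming it is declared as a template parameter named PrincipalARN (an assumption on my side; the value could just as well be filled in by a preprocessing step before deployment):

```yaml
Parameters:
  # Hypothetical parameter: ARN of the IAM role the Data Analysts assume.
  PrincipalARN:
    Type: String
    Description: "ARN of the IAM User, Group or Role that gets access to the Portfolio."
    AllowedPattern: "arn:aws:iam::[0-9]{12}:(user|group|role)/.+"
```

The AllowedPattern is optional; it only catches obviously malformed ARNs early.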

Portfolio Constraint

To launch and tear down the product, Service Catalog needs a Role to assume. This role carries all the policies needed to create and remove the Product. With the LaunchRoleConstraint we bind a Product/Portfolio combination to that Role.

SagemakerLaunchRoleConstraint:
  Type: AWS::ServiceCatalog::LaunchRoleConstraint
  Properties:
    Description: "Constraint to run Sagemaker and S3 in Cloudformation."
    PortfolioId:
      Ref: DatalakePortfolio
    ProductId:
      Ref: SagemakerProduct
    RoleArn:
      Fn::GetAtt: [ LaunchConstraintRole, Arn ]
  # DependsOn must match the logical ID of the principal association defined above
  DependsOn: [ DatalakePortfolioPrincipalPUAssociation, LaunchConstraintRole ]

LaunchConstraintRole:
  Type: "AWS::IAM::Role"
  Properties:
    Path: "/"
    AssumeRolePolicyDocument:
      Version: "2012-10-17"
      Statement:
        - Effect: "Allow"
          Principal:
            Service: "servicecatalog.amazonaws.com"
          Action: "sts:AssumeRole"
    Policies:
      - PolicyName: "AllowProductLaunch"
        PolicyDocument:
          Version: "2012-10-17"
          Statement:
            - Resource: "*"
              Effect: "Allow"
              Action:
                # Permissions required for the provisioning of the Sagemaker
                - cloudformation:GetTemplateSummary
                - s3:GetObject
                - sagemaker:*
                - s3:*
                - iam:Get*
                - iam:PassRole
                - ec2:*
                - kms:*
            - Resource:
                - "arn:aws:iam::*:role/SC-*"
                - "arn:aws:sts::*:assumed-role/datalake-service-catalog-LaunchConstraintRole-*"
                - "arn:aws:iam::*:role/sagemaker-notebook-iam-role"
              Effect: "Allow"
              Action:
                - iam:*
            - Resource:
                - "arn:aws:cloudformation:*:*:stack/SC-*"
                - "arn:aws:cloudformation:*:*:changeSet/SC-*"
              Effect: "Allow"
              Action:
                # Permissions required by AWS Service Catalog to create stack
                - cloudformation:CreateStack
                - cloudformation:DeleteStack
                - cloudformation:DescribeStackEvents
                - cloudformation:DescribeStacks
                - cloudformation:SetStackPolicy
                - cloudformation:ValidateTemplate
                - cloudformation:UpdateStack

Please use this link to download the full scripts to hit the ground running. Make sure you review the IAM policies before using them for real.
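
As an example of what such a review could lead to: the launch role above allows iam:PassRole on every resource. A tighter variant of that one permission might look like the statement below. This is a sketch under my own assumptions, namely that only the sagemaker-notebook-iam-role ever needs to be passed and that it is only passed to Sagemaker; check it against your own product template before relying on it.

```yaml
# Replacement for the broad iam:PassRole entry in the first statement.
- Effect: "Allow"
  Action:
    - iam:PassRole
  Resource:
    - "arn:aws:iam::*:role/sagemaker-notebook-iam-role"
  Condition:
    StringEquals:
      # Only allow the role to be handed to the Sagemaker service.
      "iam:PassedToService": "sagemaker.amazonaws.com"
```

The same approach applies to the sagemaker:*, s3:*, ec2:* and kms:* wildcards: narrow them down to the actions and resources the product actually uses.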

Conclusion

Cloud infrastructure should be in code. We all know that. The smallest manual input tends to break a lot and makes things unpredictable. We have all dealt with that. But sometimes the AWS Console provides a nice way for users to interact with Cloud resources. AWS Service Catalog allows us to create resources on demand while still keeping them grouped in a defined, structured way. This gives users the freedom to work when they want to, with a variety of products, while saving costs at the same time.

Thijs de Vries
Cloud Consultant