Amazon SageMaker Canvas FAQs
General
Q: What is Amazon SageMaker Canvas?
Amazon SageMaker Canvas is a no-code machine learning (ML) service. SageMaker Canvas supports the entire ML workflow including data preparation, model building and training, generating predictions, and deploying the models to production. With SageMaker Canvas, you can use ML to detect fraud, predict maintenance failures, forecast financial metrics and sales, optimize inventory, generate content, and more.
Q: How do I get started with Amazon SageMaker Canvas?
To access Amazon SageMaker Canvas, begin by creating a SageMaker Domain in the AWS Management Console. Once you have created a SageMaker Domain, you can access SageMaker Canvas in two ways: launch it directly from the AWS Management Console, or launch it within SageMaker Studio, an integrated development environment (IDE) for ML.
Once you log into SageMaker Canvas, you can try the interactive product tour that walks you through each step of the ML journey with easy-to-follow directions. Additionally, you can use sample datasets provided in SageMaker Canvas to help you get started with common use cases such as house price prediction, sales forecasting, predicting loan defaults, and more.
Q: Which SSO techniques are supported by Amazon SageMaker Canvas?
All Security Assertion Markup Language (SAML) 2.0 enabled SSO techniques are supported by SageMaker Canvas. Examples include AWS SSO, Active Directory, and Okta.
Q: How am I charged for Amazon SageMaker Canvas?
With SageMaker Canvas, you pay for what you use. There are three factors that determine your bill:
- Workspace instance (Session-Hrs): This is based on the number of hours you are logged in to or using SageMaker Canvas. The time starts when you launch SageMaker Canvas and ends when you log out through the application or an administrator logs you out.
- Model Training Charges
  - Tabular models: The training charge for custom tabular models is based on the number of cells in the dataset used to train the model.
  - CV and NLP models: The training charge for custom NLP and CV models is based on the amount of compute time it takes to train the model.
- Inference Charges
  - Real-time endpoints: Deploying a model to real-time endpoints utilizes SageMaker resources and you are charged for your usage of these resources.
  - Ready-to-use model usage: Usage of ready-to-use models to generate insights and extract information from documents, images, and text is rendered by AWS AI services. You are charged for the respective service powering the ready-to-use model that you use.
  - Custom-model predictions: You are charged for compute used to produce single or batch predictions from trained models.
See the SageMaker Canvas pricing page for details.
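As a rough illustration of how the three billing factors combine, the Python sketch below adds up a bill from usage figures. It is a minimal sketch, not an official calculator: the class and function names are made up for this example, every rate is a hypothetical placeholder supplied by the caller rather than an AWS price, and a flat per-unit rate is assumed for each factor. Use the pricing page for real numbers.

```python
from dataclasses import dataclass


@dataclass
class CanvasUsage:
    """Usage for one billing period (all values supplied by the caller)."""
    session_hours: float          # hours logged in to the workspace instance
    tabular_training_cells: int   # cells in datasets used to train tabular models
    cv_nlp_training_hours: float  # compute hours spent training CV and NLP models
    inference_charges: float      # inference charges, already expressed in dollars


@dataclass
class HypotheticalRates:
    """Placeholder rates for illustration only, NOT real AWS prices."""
    per_session_hour: float
    per_million_cells: float
    per_training_hour: float


def estimate_bill(usage: CanvasUsage, rates: HypotheticalRates) -> dict:
    """Sum the three billing factors: workspace, training, and inference."""
    workspace = usage.session_hours * rates.per_session_hour
    training = (
        usage.tabular_training_cells / 1_000_000 * rates.per_million_cells
        + usage.cv_nlp_training_hours * rates.per_training_hour
    )
    inference = usage.inference_charges
    return {
        "workspace": workspace,
        "training": training,
        "inference": inference,
        "total": workspace + training + inference,
    }
```

Keeping the rates in their own object makes it clear which numbers are assumptions and lets you swap in the current prices for your Region.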
Q: How do I control my costs and log out of Amazon SageMaker Canvas?
When you are logged in to SageMaker Canvas, dedicated compute resources are made available to you at an associated hourly session rate. To help control costs, log out of SageMaker Canvas when you have finished your work for the day by clicking the logout icon at the bottom of the left navigation panel. Alternatively, your administrator can log you out programmatically. Workspace instance (Session-Hrs) charges stop once you log out or an administrator logs you out. Administrators can run the programmatic logout on a fixed schedule, or use an Amazon CloudWatch metric called TimeSinceLastActive to trigger a logout after a chosen idle period.
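For administrators, the idle-based logout can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not an official procedure: the function names are hypothetical, the idle time is assumed to have already been read from the TimeSinceLastActive metric and to be expressed in seconds, and the Canvas application is assumed to be the app of type "Canvas" named "default" in the user profile. The client is passed in so that the logic can be exercised without AWS credentials.

```python
def should_log_out(time_since_last_active_s: float, idle_limit_s: float) -> bool:
    """Return True once the user has been idle for at least the limit."""
    return time_since_last_active_s >= idle_limit_s


def log_out_if_idle(
    sagemaker_client,
    domain_id: str,
    user_profile_name: str,
    time_since_last_active_s: float,
    idle_limit_s: float = 7200.0,  # hypothetical default: two hours
) -> bool:
    """Delete the user's Canvas app if the idle limit is reached.

    Returns True if a logout was requested, False otherwise.
    """
    if not should_log_out(time_since_last_active_s, idle_limit_s):
        return False
    # Assumption: deleting the Canvas app ends the billed session.
    sagemaker_client.delete_app(
        DomainId=domain_id,
        UserProfileName=user_profile_name,
        AppType="Canvas",
        AppName="default",
    )
    return True
```

In practice, `sagemaker_client` would typically be `boto3.client("sagemaker")`, and the function could run on a schedule or in response to a CloudWatch alarm.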
Q: In what regions is Amazon SageMaker Canvas available?
SageMaker Canvas is available in the US East (Ohio), US East (N. Virginia), US West (Oregon), Europe (Frankfurt), Europe (Ireland), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Tokyo), and Australia (Sydney) AWS Regions.
Q: How can I encrypt my data and ML models with Amazon SageMaker Canvas?
SageMaker Canvas supports encryption at rest for datasets and ML models using customer managed keys (CMK) with AWS Key Management Service (KMS) for all use cases, including classification, regression, and time-series forecasting. You can use your own keys to encrypt the file systems on the instances used to train models and generate insights, and the model data in your Amazon S3 bucket.
Data preparation in SageMaker Canvas
Q: What data sources does Amazon SageMaker Canvas support?
SageMaker Canvas enables you to seamlessly discover AWS data sources that your account has access to including Amazon Simple Storage Service (S3), Amazon Athena (Glue Data Catalog), Amazon Redshift, Amazon Aurora, and Amazon RDS. SageMaker Canvas also supports external data sources including Salesforce Data Cloud, Snowflake, Databricks, and 40+ SaaS platforms. Finally, you can drag and drop files from your local disk to upload your dataset to SageMaker Canvas.
Q: What data types does Amazon SageMaker Canvas support?
SageMaker Canvas supports importing tabular (CSV, Parquet), image (JPEG, PNG), and document data (PDF, JPG, PNG, TIFF).
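If you want to check files before uploading them, the supported formats can be captured in a small lookup. The Python sketch below mirrors the list above exactly; the function name and the idea of a pre-upload check are illustrative assumptions, not part of SageMaker Canvas.

```python
from pathlib import Path

# File extensions per data type, copied from the list of supported formats.
SUPPORTED_EXTENSIONS = {
    "tabular": {"csv", "parquet"},
    "image": {"jpeg", "png"},
    "document": {"pdf", "jpg", "png", "tiff"},
}


def data_types_for(filename: str) -> list:
    """Return the data types whose format list includes this file's extension."""
    extension = Path(filename).suffix.lower().lstrip(".")
    return [
        data_type
        for data_type, extensions in SUPPORTED_EXTENSIONS.items()
        if extension in extensions
    ]
```

An empty result means the extension does not appear in any of the lists, so the file would likely need converting before import.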
Q: How can I analyze and explore my data?
You can analyze and explore your data in SageMaker Canvas using prebuilt visualizations, or use natural language to generate custom visualizations. SageMaker Canvas also provides a Data Quality and Insight report to verify data quality (such as missing values, duplicate rows, and data types) and detect anomalies (such as outliers, class imbalance, and data leakage) in your data.
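To make two of these checks concrete, the Python sketch below counts missing values per column and duplicate rows in a small CSV. It only illustrates the kind of check involved, using the standard library; it is not how SageMaker Canvas produces its report, and the function name is hypothetical.

```python
import csv
import io
from collections import Counter


def basic_quality_summary(csv_text: str) -> dict:
    """Count rows, missing values per column, and duplicate rows in CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = reader.fieldnames or []

    # A value counts as missing if it is absent or blank after trimming.
    missing = {
        col: sum(1 for row in rows if (row[col] or "").strip() == "")
        for col in columns
    }

    # Every repeat of an identical row beyond its first occurrence is a duplicate.
    row_counts = Counter(tuple(row[col] for col in columns) for row in rows)
    duplicates = sum(count - 1 for count in row_counts.values())

    return {
        "rows": len(rows),
        "missing_per_column": missing,
        "duplicate_rows": duplicates,
    }
```

For example, a three-row file whose first and third rows are identical and whose second row has a blank price would report one duplicate row and one missing value in the price column.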
Q: How can I prepare my data in Amazon SageMaker Canvas?
SageMaker Canvas offers a selection of over 300 prebuilt, PySpark-based data transformations, so you can transform your data and scale your data preparation workflow without writing a single line of code. Additionally, you can transform your data for ML models using FM-powered natural language instructions.
Q: How can I automate data preparation flows I’ve built in Amazon SageMaker Canvas?
You can launch or schedule a job to quickly process your data, or export the data preparation flow as a processing step in your ML workflow through the integration with SageMaker Pipelines.
Q: How can I validate my data to confirm it is ready to build a model?
SageMaker Canvas provides a Data Quality and Insight report in the data preparation flow to check data quality and estimate model accuracy. It also validates your data prior to model building to check for common issues.
Custom and ready-to-use models in SageMaker Canvas
Q: Does Amazon SageMaker Canvas support foundation models?
Yes, SageMaker Canvas provides access to ready-to-use foundation models (FMs) for content generation, text extraction, and text summarization. You can access FMs such as Claude 2, Amazon Titan, and Jurassic-2 (powered by Amazon Bedrock) as well as publicly available FMs such as Falcon and MPT (powered by SageMaker JumpStart) through a no-code interface and tune them using your own data.
Q: What ready-to-use models does Amazon SageMaker Canvas support?
SageMaker Canvas offers ready-to-use tabular, NLP, and CV models for use cases including sentiment analysis, object detection in images, text detection in images, and entity extraction. These ready-to-use models do not require model building, and are powered by AWS AI services, including Amazon Rekognition, Amazon Textract, and Amazon Comprehend.
Q: What kinds of ML models can I create in Amazon SageMaker Canvas?
Currently, you can create classification (binary and multiple categories), regression, time-series forecasting, single-label image classification, and multi-category text classification models in SageMaker Canvas.
Q: How can I build a model in SageMaker Canvas?
SageMaker Canvas provides multiple options to build a model.
- Preview: This option lets you preview your model in about 2 minutes to give you an indicator of the model accuracy and feature importance.
- Quick Build: This option allows you to build a model quickly (approximately between 2 and 20 minutes) and provides a ready-made model.
- Standard Build: This option is extensive and may take a few hours depending on the size of your dataset. A standard build runs training experiments using different combinations of hyperparameters, generates multiple models in the backend, and provides detailed information including metric scores. It then picks the best model, which you can evaluate and use.
Q: How can I explain my model to others?
SageMaker Canvas provides column impact analysis which explains the impact of each column in your dataset on a model. SageMaker Canvas also provides additional metrics that provide visibility into model performance. Additionally, when you generate predictions, you can see the column impact that identifies which columns have the most impact on each prediction.
Q: Can data scientists share models built outside of Amazon SageMaker so I can generate predictions on those models in Amazon SageMaker Canvas?
Yes. Data scientists can share any ML model built by other tools once it is registered in the SageMaker Model Registry, allowing you to generate predictions on these models in SageMaker Canvas.
Making predictions using SageMaker Canvas
Q: How do I make predictions?
To make a single prediction, go to the “single prediction” tab of the corresponding model version and input values, and SageMaker Canvas will show you the prediction. You can also use sliders and pull-down menus to change input values and see the impact on the prediction. To make predictions for multiple observations or rows of data, go to the “bulk prediction” tab and drag and drop the CSV, JPEG, or PNG file containing your observations, and SageMaker Canvas will create a new CSV, JPEG, or PNG file with predictions. SageMaker Canvas allows you to run manual and automatic batch predictions. Automatic batch prediction workflows are triggered every time an associated dataset is updated. You can then review prediction results inline or download them for review.
Q: How can I build predictive dashboards with predictions from Amazon SageMaker Canvas?
You can select single or multiple batch predictions in SageMaker Canvas and send them to multiple Amazon QuickSight users in an account. You can open QuickSight with a single click from SageMaker Canvas, analyze the predictions as a dataset, and build and publish predictive dashboards that can be continuously updated for new and changed data.
Use SageMaker Canvas models with SageMaker Studio for MLOps
Q: Can I share models built in Amazon SageMaker Canvas with data scientists and collaborate with them?
Yes. You can share ML models built in SageMaker Canvas with data scientists working in SageMaker Studio. Data scientists can review, update, and share updated model versions with you, so you can generate predictions on the new versions in SageMaker Canvas.
After building and training a standard model in SageMaker Canvas, you can share your ML model using the share button in SageMaker Canvas. You can choose to share the model to a single user or multiple users within SageMaker Studio.
Q: Which ML model artifacts can be shared from Amazon SageMaker Canvas to Amazon SageMaker Studio?
The ML model and artifacts shared from SageMaker Canvas will contain the dataset, data transformations (including the recipe data flow and transformation code), list of candidate models and the recommended model, data exploration report, candidate definition notebook, and explainability metrics (including feature importance).
Q: Which artifacts can be edited and updated by data scientists?
Data scientists using SageMaker Studio can view model artifacts and recommend an alternate candidate from the list of candidates in SageMaker Autopilot. In addition, they can open and update data transformations with SageMaker Data Wrangler, update the model using SageMaker Autopilot, and share the new model version.
Q: How can SageMaker Studio users update models they receive from SageMaker Canvas?
SageMaker Studio users can send updates on model versions using the share button within SageMaker Studio. The updated model from SageMaker Studio will appear as a new version of the original shared model directly in SageMaker Canvas.
Q: How can I distinguish between my original shared model and a new shared model?
Updated models are automatically versioned within SageMaker Canvas. You can access different versions of the model through the drop-down menu within SageMaker Canvas.
Q: What use cases and model types can I share from Amazon SageMaker Canvas to Amazon SageMaker Studio?
You can share standard build models containing tabular data for all use cases within SageMaker Canvas, including customer churn, predicting home prices, sales forecasting, predicting loan defaults, predicting hospital bed occupancy, and time-series forecasting models. You can also share custom CV and NLP models.
Bring your own ML model
Q: Can I push ML models created in Amazon SageMaker Canvas to my existing MLOps CI/CD processes?
Yes. Once you create ML models in SageMaker Canvas, you can register them to SageMaker Model Registry and plug them into your existing model deployment CI/CD processes. SageMaker Model Registry is a repository that catalogs ML models, manages various model versions, associates metadata, manages the approval status of a model, and deploys them to production.
Q: How does model registration work in Amazon SageMaker Canvas?
When you select a model version in SageMaker Canvas and register it to SageMaker Model Registry in your own account, SageMaker Canvas automatically sends model artifacts to SageMaker Model Registry, such as reference links to the model inference container, model feature importance reports, model metadata such as training metrics, and associated charts. Once the model is registered, you can track the approval status in SageMaker Canvas. Rejecting a model in SageMaker Model Registry prevents the model from being deployed into an escalated environment, whereas approving a model in SageMaker Model Registry can trigger a model promotion pipeline. The model promotion pipeline automatically copies the model to your pre-production AWS account and signals that the model is ready for inference workloads.
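The approval or rejection step is typically performed by an MLOps engineer or data scientist outside SageMaker Canvas. The Python sketch below shows one way that step might look, assuming the registered version is addressed by its model package ARN and updated through the SageMaker `update_model_package` operation. The function name, the optional note, and the client injection are illustrative choices, and whether a status change starts a promotion pipeline depends on how your own CI/CD is wired.

```python
VALID_STATUSES = {"Approved", "Rejected", "PendingManualApproval"}


def set_approval_status(sagemaker_client, model_package_arn: str,
                        status: str, note: str = ""):
    """Set the approval status of a registered model version.

    Raises ValueError for an unknown status so typos never reach the API.
    """
    if status not in VALID_STATUSES:
        raise ValueError(f"Unsupported approval status: {status!r}")

    request = {
        "ModelPackageArn": model_package_arn,
        "ModelApprovalStatus": status,
    }
    if note:
        # Optional free-text reason recorded alongside the status.
        request["ApprovalDescription"] = note

    return sagemaker_client.update_model_package(**request)
```

In practice, `sagemaker_client` would typically be `boto3.client("sagemaker")`; passing it in keeps the validation logic testable without AWS access.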
Generative AI capabilities
Q: How does Amazon SageMaker Canvas support generative AI?
SageMaker Canvas offers ready-to-use foundation models (FMs) powered by Amazon Bedrock and SageMaker JumpStart. These models enable you to generate and summarize content. You can use natural language instructions to perform tasks such as creating narratives, reports, and blog posts; answering questions; summarizing notes and articles; and explaining concepts, without writing a single line of code. Your data is not used to improve the base models, is not shared with third-party model providers, and stays entirely within your secure AWS environment. With the same no-code interface, you can upload a dataset and select an FM, and SageMaker Canvas automatically helps you build custom foundation models to generate predictions immediately. SageMaker Canvas also displays performance metrics, so you can collaborate easily to generate predictions using FMs and understand how well the FM is performing on a given task.
Q: What controls does Amazon SageMaker Canvas offer for foundation models?
SageMaker Canvas offers permissions that let admins control access to foundation models in the SageMaker Canvas user interface. These permissions cover both features powered by foundation models, such as data preparation, and the ready-to-use foundation models themselves.