Overview
Amazon SageMaker Canvas is a no-code machine learning service that enables businesses to build, train, and deploy highly accurate ML models using a visual interface without writing any code. SageMaker Canvas provides access to ready-to-use foundation models and pre-trained models for CV and NLP use cases. Additionally, you can import data from over 50 data sources, prepare data using built-in transforms or natural language, build custom models, and complete the ML lifecycle.
SageMaker Canvas follows a pay-as-you-go pricing model, where your bill is determined by five key factors: 1) the duration your SageMaker Canvas workspace instance is running, 2) data processing, 3) custom model training, 4) model prediction, and 5) usage of ready-to-use models. This flexible pricing model allows customers to only pay for the resources they consume. Additionally, by scheduling automated shutdown of SageMaker Canvas workspace instances when not in use, you can further optimize costs.
SageMaker Canvas Pricing Structure
1. Workspace instance (Session-Hrs)
A workspace instance is dedicated for your use when you are logged into SageMaker Canvas. You pay for the number of hours you are logged into SageMaker Canvas. The time starts when you launch the SageMaker Canvas application and ends either when you log out from the SageMaker Canvas interface or when your administrator ends your SageMaker Canvas application from the AWS Management Console. Logging out of SageMaker Canvas stops workspace instance charges.
Workspace instance (Session-Hrs) charges
$1.9/hour
2. Data processing charges
SageMaker Canvas supports processing of tabular, time-series, structured text, and image data, and it processes up to 5 GB of data in the workspace instance at no additional cost. For datasets larger than 5 GB, the Data Wrangler capability in SageMaker Canvas leverages Amazon EMR Serverless, an auto-scaling technology that enables efficient data preparation for tabular, time-series, and structured text data. When you work with datasets larger than 5 GB, or explicitly choose to use EMR Serverless, you're charged based on the aggregate vCPU, memory, and storage resources consumed by the EMR Serverless workers, from the start to the end of each worker's runtime. The charges are calculated by rounding up to the nearest second, with a 1-minute minimum charge. Pricing varies by region and instance type, with more details available on the Amazon EMR Serverless pricing page. Alternatively, you can run your data processing jobs using SageMaker Processing for any data size; consult the SageMaker pricing page for applicable rates based on ML compute instances, data processed, and storage used. The exact processing charges when running your Data Wrangler flow on your entire dataset with either EMR Serverless or SageMaker Processing depend on your data size and the type of transforms you have selected, which can heavily influence the compute requirements. The table below provides estimated charges based on the approximate time it takes to import and sample a dataset using EMR Serverless. The actual time and charges to export the data will vary based on the exact size of the dataset and the type of transforms applied.
Data size to import and sample (random or stratified) | Estimated charge |
<5 GB | $0 |
5 - 100 GB | $0.09 - $1.5 |
100 - 500 GB | $2 - $3.8 |
500 GB - 1 TB | $4 - $12 |
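The per-second rounding and 1-minute minimum described in this section can be sketched as follows. This is a minimal illustration, not the actual EMR Serverless billing logic: the function names are hypothetical, and the per-hour rates are parameters you would fill in from the Amazon EMR Serverless pricing page for your region.

```python
import math

MINIMUM_BILLABLE_SECONDS = 60  # 1-minute minimum charge, as stated in this section


def billable_seconds(runtime_seconds: float) -> int:
    """Round a worker's runtime up to a whole second, with a 60-second floor."""
    if runtime_seconds < 0:
        raise ValueError("runtime_seconds must be non-negative")
    return max(MINIMUM_BILLABLE_SECONDS, math.ceil(runtime_seconds))


def worker_charge(runtime_seconds: float, vcpus: float, memory_gb: float,
                  vcpu_hour_rate: float, gb_hour_rate: float,
                  storage_gb: float = 0.0, storage_gb_hour_rate: float = 0.0) -> float:
    """Charge for one worker; all rates are caller-supplied placeholders."""
    hours = billable_seconds(runtime_seconds) / 3600
    hourly_cost = (vcpus * vcpu_hour_rate
                   + memory_gb * gb_hour_rate
                   + storage_gb * storage_gb_hour_rate)
    return hours * hourly_cost
```

Under these assumptions, a worker that runs for 12.3 seconds is billed for 60 seconds, and one that runs for 60.2 seconds is billed for 61.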
3. Custom model training charges
SageMaker Canvas supports automated model building (AutoML) for a variety of tasks, including tabular data (regression and classification), time-series forecasting, image and text classification (computer vision and natural language processing), and fine-tuning large language models. It validates the prepared data, leverages Amazon EMR Serverless for efficient data preparation when needed, triggers SageMaker Autopilot for custom model exploration and training, and generates a model leaderboard to visualize results, model scores, feature importance, and insights.
3.1 Tabular and Time-series models
SageMaker Canvas supports a wide range of machine learning tasks for tabular models, including numeric prediction (regression), binary classification, multi-class classification, and time-series forecasting.
For tabular datasets up to 5GB and time-series datasets up to 30GB, Canvas leverages SageMaker Training instances. You will be charged based on the instance hours used for model training in Amazon SageMaker. Canvas automatically selects appropriate instance types, such as ml.m5.12xlarge, ml.c5.18xlarge, and ml.m5.4xlarge, based on the dataset size, performance, and availability. Refer to the SageMaker instance pricing page for more information.
For tabular datasets exceeding 5GB and time-series datasets exceeding 30GB, SageMaker Canvas uses Amazon EMR Serverless to efficiently downsample and prepare the data, and then uses SageMaker instances for model training. You will be charged based on the EMR Serverless pricing model for the data processing step (described in section 2, Data processing charges) and based on the Amazon SageMaker instance pricing for the model training phase.
The following table provides an estimate of the training cost for a standard build, based on dataset size and the SageMaker and EMR Serverless instance hours used. Please note that these figures are approximate, and actual hours and costs may vary.
Data Size | Estimated EMR Serverless charges | Estimated SageMaker Training Instance charges | Total estimated charge |
<100 MB | $0 | $2.3 - $9.2 | $2.3 - $9.2 |
100 MB - 1 GB | $0 | $9.2 - $13.8 | $9.2 - $13.8 |
1 - 5 GB | $0 | $13.8 - $18.8 | $13.8 - $18.8 |
5 - 100 GB | $221 - $276 | $18.8 - $27.5 | $240 - $303.5 |
100 - 500 GB | $276 - $387 | $18.8 - $27.5 | $295 - $415 |
500 GB - 1 TB | $387 - $497 | $18.8 - $27.5 | $406 - $525 |
3.2 CV & NLP models
SageMaker Canvas supports 2 and 3+ category prediction (binary and multi-class text classification and image classification) for custom NLP and CV models. The training charge for custom NLP and CV models is based on the amount of time it takes to train the model. SageMaker training instances are used to render the model training service and you will be charged directly from SageMaker. Based on the instances used from SageMaker Canvas, the training price will range from $2.03 - $4.89 per hour of training time. For further pricing information, please see SageMaker Pricing.
The following table provides estimated custom CV model training charges based on image resolution of 640 x 480 pixels. The estimates use a SageMaker instance price for ml.g4dn.12xlarge of $4.89/h.
Number of images | Estimated charge |
100 | $1.62 |
250 | $1.63 |
500 | $1.65 |
1,000 | $1.68 |
5,000 | $1.97 |
10,000 | $2.33 |
50,000 | $5.19 |
The following table provides estimated custom NLP model training charges based on an average of 240 unicode characters per cell. The estimates use the SageMaker instance price for ml.g4dn.12xlarge of $4.89/h.
Number of cells | Estimated charge |
100 | $3.01 |
500 | $3.11 |
1,000 | $3.24 |
5,000 | $4.22 |
10,000 | $9.98 |
50,000 | $15.25 |
Note: Training times and charges are subject to variance based on a number of factors including image resolution for CV, number of characters per sequence for NLP, and number of categories.
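Because CV and NLP training is billed by time, each estimate in the tables above is an hourly rate multiplied by a training duration. The sketch below shows that relationship and backs out the training time implied by a table entry. The function names are hypothetical, and the $4.89/h rate is the ml.g4dn.12xlarge price quoted in this section.

```python
def training_charge(minutes: float, hourly_rate: float) -> float:
    """Charge for a training run billed by time: rate x minutes / 60."""
    return hourly_rate * minutes / 60


def implied_minutes(charge: float, hourly_rate: float) -> float:
    """Training time (in minutes) implied by an estimated charge at a given rate."""
    return charge / hourly_rate * 60


# The 1,000-image row ($1.68 at $4.89/h) implies roughly 20.6 minutes of training.
minutes_for_1000_images = implied_minutes(1.68, 4.89)
```

Reading the tables this way makes it easier to re-estimate a charge if your region uses a different hourly rate for the same training time.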
3.3 Fine-Tuning Foundation models
SageMaker Canvas supports fine-tuning of foundation models (FMs) if you have a specific use-case and would like to customize the model responses based on your own data. Canvas leverages SageMaker Training instances to fine-tune FMs. You will be charged based on the instance hours used for model fine-tuning in Amazon SageMaker. Canvas automatically selects appropriate instance types such as ml.g5.8xlarge, ml.g5.24xlarge, and ml.g5.48xlarge. This instance selection is based on the availability of the instances in those regions. Refer to the Amazon SageMaker instance pricing page for more information.
4. Model prediction charges
In SageMaker Canvas, you can generate predictions from your trained models through real-time or batch inference. The charges for model predictions vary based on the type of inference and the size of the dataset.
Real-time Inference:
When you deploy a Canvas model for real-time inference, you are charged for the usage of the specific Amazon SageMaker instance type on which the model is hosted. The pricing for real-time inference is based on the Amazon SageMaker Pricing for Hosting: Real-Time Inference, which depends on the instance type and duration of usage.
Batch Inference:
For batch predictions, the charges depend on the type of model and the size of the dataset. Details on how batch predictions are priced for each data type follow:
4.1 Tabular Models
Batch predictions with numeric prediction, binary classification, and multi-class classification custom tabular models on datasets up to 5GB run within the SageMaker Canvas application, free of additional charges.
If you have a tabular dataset larger than 5GB, the batch prediction process will leverage Amazon EMR Serverless for data processing and Amazon SageMaker Batch Transform for generating predictions. In this case, you will be charged based on the EMR Serverless pricing model for the data processing step and the SageMaker Batch Transform pricing for the prediction generation.
Data Size | Estimated EMR Serverless charges | Estimated SageMaker Batch Transform Instance charges | Total estimated charge |
0 - 5 GB | $0 | $0 | $0 |
5 - 100 GB | $13.9 - $42.3 | $14 - $34 | $27.9 - $76.3 |
100 - 500 GB | $42.3 - $90.3 | $34 - $91 | $76.3 - $181.3 |
500 GB - 1 TB | $90.3 - $181 | $91 - $182 | $181.3 - $363 |
4.2 Time-series forecasting models
With time-series forecasting models, you can generate either single or batch predictions. For predictions with time-series forecasting, charges apply for Amazon SageMaker Asynchronous Inference, Amazon SageMaker Batch Transform, or both.
For single prediction, SageMaker Asynchronous Inference charges apply, with a minimum of 2 hours. Depending on your region, charges may range from $0.408 to $0.533 per hour. The charges stop automatically after two idle hours.
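One way to read the single-prediction rule above is that you pay for the time you actively run predictions plus the two idle hours before the endpoint stops. The sketch below follows that reading, which is also how Example 6 on this page computes the charge. The function name is hypothetical, and the hourly rate is a parameter because it varies by region.

```python
IDLE_SHUTDOWN_HOURS = 2.0  # charges stop automatically after two idle hours


def single_prediction_charge(active_hours: float, hourly_rate: float) -> float:
    """Asynchronous Inference charge: active time plus the idle window before auto-stop."""
    if active_hours < 0:
        raise ValueError("active_hours must be non-negative")
    return hourly_rate * (active_hours + IDLE_SHUTDOWN_HOURS)
```

For instance, 30 minutes of what-if analysis at $0.408 per hour gives 0.408 x 2.5 hours, or about $1.02.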
For batch prediction, SageMaker Batch Transform charges apply based on the amount of time it takes to generate your predictions. The table below estimates charges based on the number of time-series observed in your data.
Data Size | Estimated EMR Serverless charges | Estimated SageMaker Batch Transform Instance charges | Total estimated charge |
0 - 5 GB | $0.5 - $0.8 | $0.75 - $1.13 | $1.25 - $2.03 |
5 - 100 GB | $0.8 - $18 | $1.13 - $27 | $2.03 - $45 |
100 - 500 GB | $18 - $81 | $27 - $137 | $45 - $218 |
500 GB - 1 TB | $81 - $160 | $137 - $261 | $218 - $421 |
For detailed SageMaker pricing, please see SageMaker Pricing.
4.3 CV and NLP models
The prediction charge for custom CV and NLP models is based on the amount of time it takes to generate your predictions. SageMaker instances, priced at $0.408 per hour of prediction generation time, are used to render the model prediction and you will be charged directly from SageMaker. For further pricing information, please see SageMaker Pricing.
For example, the estimated charge to generate a prediction for 1,000 images of resolution 640 x 480 is $0.03. Similarly, the estimated charge to generate a prediction for 1,000 sequences of 520 unicode characters per sequence is $0.01.
5. Ready-to-use model charges
SageMaker Canvas offers access to a wide range of foundation models from Amazon Bedrock and Amazon SageMaker JumpStart, and pre-trained models from Amazon Rekognition, Amazon Comprehend, and Amazon Textract for CV and NLP use cases.
For content generation, extraction, and summarization using foundation models (FMs) from Amazon Bedrock, you will be charged based on the volume of input and output tokens. For more information, see Amazon Bedrock pricing. SageMaker JumpStart FMs are deployed on SageMaker instances, and you will be charged for the duration the selected instance type is running. Refer to the Amazon SageMaker Pricing for Hosting: Real-Time Inference for more information.
Requests for object detection and text detection in images using Amazon Rekognition are charged based on the number of images in your dataset. For specific pricing details, please consult the Amazon Rekognition pricing page.
Requests for sentiment analysis, entity extraction, language detection, and personal information detection using Amazon Comprehend are measured in units of 100 characters, and you will be charged according to the number of units in your dataset. Refer to the Amazon Comprehend pricing page for further pricing information.
Requests for expense analysis, document analysis, and identity document analysis using Amazon Textract are measured in units of 1,000 pages, and you will be billed based on the number of units in your dataset. Consult the Amazon Textract pricing page for detailed pricing information.
Canvas free tier
Amazon SageMaker Canvas provides a 2-month free tier. The free tier includes workspace instance (Session-Hrs) usage up to 160 hours/month for using the SageMaker Canvas application.
Ready-to-use NLP, CV, and foundation models are provided by Amazon Rekognition, Amazon Comprehend, Amazon Textract, and Amazon Bedrock. Each service offers a different free tier duration and coverage. To learn more, please see the respective AWS service pricing pages: Amazon Rekognition, Amazon Comprehend, Amazon Textract, and Amazon Bedrock.
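As a rough sketch of how the workspace free tier affects a monthly bill, the function below subtracts the free hours before applying the Session-Hrs rate. It assumes the 160 hours are pooled across all users in the account, which is how Example 1 on this page treats them. The names are hypothetical.

```python
WORKSPACE_RATE_PER_HOUR = 1.90    # Session-Hrs rate quoted on this page
FREE_TIER_HOURS_PER_MONTH = 160   # free hours per month during the 2-month free tier


def workspace_charge(session_hours: float, in_free_tier: bool) -> float:
    """Monthly workspace charge for total session hours (all users combined)."""
    free_hours = FREE_TIER_HOURS_PER_MONTH if in_free_tier else 0
    billable_hours = max(0.0, session_hours - free_hours)
    return billable_hours * WORKSPACE_RATE_PER_HOUR
```

With these assumptions, 160 hours during the free tier costs $0, while 40 hours after the free tier ends costs $76.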
Pricing examples
Example 1:
Let’s say you have a team of 4 analysts who want to try SageMaker Canvas. One of them builds a numeric prediction model to predict on-time delivery of packages, using a 50 MB input dataset. SageMaker Canvas used 2.9 instance hours of the ml.m5.12xlarge type to train the model. Through this process, each team member is logged into SageMaker Canvas for 10 hours per week. The time is spent exploring data, preparing datasets, and generating predictions, translating to 40 hours per month per user, or 160 hours of total usage. The bill at the end of the month would be calculated as follows:
Workspace instance (Session-Hrs) charges under free tier up to 160 hours/month: $0.00
Model training charges: $2.765/hour x 2.9 = $8.02 (50 MB input dataset)
Total: $8.02
Example 2:
Let’s say that after you consumed the free tier, your team continues to use SageMaker Canvas. You build a numeric prediction model using an input dataset of 150 MB. SageMaker Canvas used 10 instance hours of the ml.c5.18xlarge instance type to train the model. Throughout this process, the team is logged into SageMaker Canvas and spends 40 hours in SageMaker Canvas during one month to explore data, join datasets, and run predictions. The bill at the end of the month would be calculated as follows:
Workspace instance (Session-Hrs) charges: $1.9 x 40 = $76
Model training charges: $3.672/hour x 10 = $36.72
Total: $112.72
Example 3:
Let’s say after you consumed the free tier, you build a custom CV classification model to detect manufacturing defects in images and you use a training dataset of 1,000 images. Your training time is approximately 21 minutes and the price point is $4.89 per hour. During the process, you spend 4 hours in SageMaker Canvas to label the images in the training dataset, view the explainability heat map, and understand the model accuracy. You then run predictions which take approximately 12 minutes at a price point of $0.408 per hour. The bill would be calculated as follows:
Workspace instance (Session-Hrs) charges: $1.9 x 4 = $7.60
Model training charges: $4.89/hour x 21 mins x 1/60 = $1.68
Predictions: $0.408/hour x 12 mins x 1/60 = $0.08
Total: $9.36
Example 4:
Let’s say after you consumed the free tier, you build a custom NLP model to understand user sentiment in reviews and you use a training dataset of 6,700 reviews at an average of 120 characters per review and you use your model to generate predictions on 1,000 reviews. Your training time is approximately 31 minutes and the price point is $3.825 per hour, and the time to generate a prediction is 4.1 minutes and the price point is $0.408 per hour. During the process, you spend 2 hours in SageMaker Canvas to label the reviews in the training dataset and view the prediction results. The bill would be calculated as follows:
Workspace instance (Session-Hrs) charges: $1.9 x 2 = $3.80
Model training charges: $3.825/hour x 31 mins x 1/60 = $1.98
Predictions: $0.408/hour x 4.1 mins x 1/60 = $0.03
Total: $5.81
Example 5:
Let’s say after you consumed the free tier, you want to extract information from 50 identity documents. During the process, you spend 1.5 hours in SageMaker Canvas to import your documents and view your results. The bill would be calculated as follows:
Workspace instance (Session-Hrs) charges: $1.9 x 1.5 = $2.85
Ready-to-use model charges (as per Amazon Textract pricing): The pricing per page in the US West (Oregon) Region for the first 100,000 pages is $0.025 per page. The charge would be $0.025 x 50 = $1.25
Total: $4.10
Example 6:
Let’s say that after you consumed the free tier, you build a custom time-series forecasting model to predict product demand. You own a clothing company with 1,000 items sold in 50 stores worldwide and are forecasting product demand for the next 12 weeks. You used a 200 MB dataset that includes weekly sales from the past year and information on two additional attributes: price and marketing spend. SageMaker Canvas uses 3 SageMaker Training instance hours of ml.m5.12xlarge instance types to train the model. After the model is built, you spend 30 minutes performing 'what-if' analysis with single predictions, which use SageMaker Asynchronous Inference on an ml.c5.2xlarge instance that SageMaker Canvas automatically stops after two idle hours. Subsequently, you generate batch predictions of a 12-week forecast horizon, requiring 3 SageMaker Batch Transform hours on ml.m5.12xlarge instances. Throughout this process, your team is logged into SageMaker Canvas, spending 10 hours during the month to explore data, join datasets, and run predictions. The bill at the end of the month would be calculated as follows:
Workspace instance (Session-Hrs) charges: $1.90 x 10 = $19
Model training charges: $2.765/hour x 3 hrs = $8.30
Single Predictions: $0.408 x (30 mins of usage + 2 hrs idle time) = $1.02
Batch Predictions: $2.765/hour x 3 hrs = $8.30
Total: $36.62
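The arithmetic in Example 6 can be reproduced with a short script. This is only a sketch: rounding each line item to the nearest cent (half up) before summing is an assumption made here to match the figures shown, and the names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def cents(amount: Decimal) -> Decimal:
    """Round a dollar amount to two decimal places, half up (an assumption)."""
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


line_items = {
    # $1.90/hour x 10 session hours
    "workspace": cents(Decimal("1.90") * 10),
    # ml.m5.12xlarge at $2.765/hour x 3 training hours
    "training": cents(Decimal("2.765") * 3),
    # $0.408/hour x (0.5 hours of usage + 2 idle hours)
    "single_predictions": cents(Decimal("0.408") * (Decimal("0.5") + 2)),
    # ml.m5.12xlarge at $2.765/hour x 3 Batch Transform hours
    "batch_predictions": cents(Decimal("2.765") * 3),
}

total = sum(line_items.values(), Decimal("0"))

for name, amount in line_items.items():
    print(f"{name}: ${amount}")
print(f"total: ${total}")
```

Decimal is used instead of float so that amounts such as 8.295 round to 8.30 predictably.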
Example 7:
Let's say after you consumed the free tier, you want to build a custom tabular classification model to predict customer churn using a large dataset of 500 GB. The dataset includes customer demographics, usage patterns, and subscription details, and you would like to use Data Wrangler to prepare your data first. During dataset import in Data Wrangler using random or stratified sampling, SageMaker Canvas leverages Amazon EMR Serverless to import and downsample the large dataset so that you can build a data flow interactively. Once your sample is imported, you can visualize and understand your data and add transform steps to your data flow. Once you are ready to trigger a model build (with AutoML), you can click “create a model”. At the moment of export from Data Wrangler to the SageMaker Canvas model build phase, your data flow runs on your entire dataset using EMR Serverless to apply the transforms you added to your flow. The data transformations and the size of your data determine how long EMR Serverless runs and how much compute it allocates. It then stores the entire processed dataset in SageMaker Canvas datasets. It also creates a model in “draft” state under the name you specified, which you can then configure before triggering a build (Quick or Standard). This process downsamples your dataset and runs additional data preparation steps in EMR Serverless as required by SageMaker Autopilot. Once downsampling and additional data preparation finish, model exploration starts. This process uses SageMaker instances, and you will be charged based on the instance hours used for model training. The exact instance hours depend on the number of columns and column types in your dataset. Once the model is built, you can evaluate the model metrics, compare models, and generate batch predictions on your test dataset (assuming 10% of the dataset is held out for testing, this is 50 GB).
The batch prediction process uses EMR Serverless to chunk the data into smaller batches and then generates predictions using SageMaker Batch Transform. Lastly, assume that over the whole end-to-end model building process your team spends a total of 20 hours in SageMaker Canvas, exploring the data, reviewing the model leaderboard, and analyzing the feature importance and model explainability. The bill at the end of the month would be calculated as follows, but please note that the actual charges may vary based on the transformations you added in your Data Wrangler flow and on your dataset characteristics (size, number of columns, column types, etc.):
Workspace instance (Session-Hrs) charges: $1.90 x 20 = $38
Data import and sampling charges: (EMR Serverless) = $3.8
Data processing charges (EMR Serverless): time and cost will depend on the transforms you picked
Model training charges (EMR Serverless for downsampling and SageMaker instances for training): $406
Batch predictions charges (EMR Serverless for chunking and SageMaker instances for predictions): $38
Total: $485.8 (without accounting for the data processing charges which depend on the selected transforms)