Please share with the community what you think needs improvement with Amazon SageMaker.
What are its weaknesses? What would you like to see changed in a future version?
The product has come a long way and they've added a lot of things, but in terms of improvement I would like to see MLflow embedded into it. I would also like richer support for machine learning pipelines (ML Pipelines), including scheduling and visualization of those pipelines.
AI is a new area and AWS needs to have an internship training program available. This is one place where I see this solution lagging. There is high-level training available, but when you consider that people have been working with Windows, Linux, and various applications for the past 20 years, they know those products inside and out. SageMaker, on the other hand, is a completely new tool. It can be very hard to digest. AWS needs to provide more use cases for SageMaker. There are some, but not enough. They should collect or create more use cases and then distribute them free of charge to the customers. I would like to see a more graphical, low-code interface that can be used to customize SageMaker.
The interface and the IDE are in need of improvement. For example, including drag and drop functionality would be helpful. If the ETL can be made a little better then that would be good for us. The entire machine learning project flow, or data science project flow, can be a little better. It is good, but more machine learning options would make it really good. Scalability to handle big data can be improved by making integration with frameworks such as Hadoop and Apache Spark easier. Adding some AI functionality, similar to what DataRobot or Azure AI have, would be really great.
The pricing is complicated and should be simplified. I would suggest that Amazon SageMaker provide free slots that allow customers to practice, such as a free slot for trying things out in a sandbox. This would be beneficial for newcomers, especially those who are getting into the cloud space. They could explore this area and cover all of the aspects, including data engineering, data recognition, and data transformation.
I would say the IDE is quite immature. It is still in its infancy, though, so I expect it to get better over time.