2020-01-09 17:10:00 UTC

Which are the best end-to-end data science platforms?


I have experience working as a senior integration architect for AI/ML enablement for a manufacturing company with 10,000+ employees.  

We are currently evaluating data science platforms. Which vendor offers an end-to-end solution that really works, from feature management to model deployment?

Thanks! I appreciate the help.

Guest
13 Answers

Top 5, Leaderboard, Consultant

Many vendors offer data science platforms, but the answer depends on what you mean by end-to-end. The fact that you wrote "really" makes me think you have already tested several of them. Data science platforms come from a variety of vendors, such as IBM, SAP, Microsoft, Domino Data Lab, and RapidMiner, among others. First, I suggest having a person or team ready to test these solutions. If you don't have one, remember to line up people with programming and process design skills.

My recommendation: if you already work with IBM, ask about their Data Science Experience. Otherwise, I suggest trying RapidMiner, which seems very useful and has a fluid interface for model deployment. You could also try SAS Enterprise Miner, which is strong in model building and model deployment and appears to be one of the leaders among these platforms.

I hope this was useful. Regards.

2020-01-10 16:29:28 UTC
Top 5, Leaderboard, Real User

KNIME or Alteryx is a good choice for a company that wants to deploy AI applications.

Either one offers:

1. light data processing like ETL,

2. AI model development and deployment,

3. and output as simple charts or to databases for further use (API, BI, etc.).

If you deploy in the cloud, you can also use AWS SageMaker or other cloud tools.

2020-01-10 08:45:29 UTC
Top 5, Leaderboard, Real User

There are many vendors offering end-to-end deployment, each with pros and cons. You can evaluate them based on:
- On-prem vs. cloud requirements
- The data volume you want to process
- Whether you already have ETL processes in place to extract the relevant data from different sources
- How you plan to consume your ML output (API, dashboard, reports, etc.)
- Lastly, the ML algorithms you intend to use and whether you are analyzing structured data, unstructured data, or both

If you need further details, I will ask my presales team to get in touch with you. Please provide your contact information.

2020-01-10 07:48:10 UTC
Top 20, User

DataRobot for on-prem
SageMaker for AWS

2020-01-10 07:37:16 UTC
User

The issue with the majority of DS platforms today is that they are built on disparate open-source libraries, or you need 5-6 different tools to build your end-to-end ML workflow. Most have never seen production, either.

At BigML, we've been around for 10+ years and were the first to market with an MLaaS platform. We can help you and your team accomplish true end-to-end ML (source > dataset > model > predictions > production), all in a single platform. We work with many clients in your space and would be happy to talk with you. You can even sign up for our platform for free and take it for a spin.

2020-02-28 16:59:46 UTC
User

One potential solution might be the SAS platform: https://www.sas.com/en_us/software/platform.html

2020-01-13 11:16:43 UTC
User

As others have said, there are many options, but add Dataiku, H2O.ai, Alteryx, and Databricks to your list.

2020-01-11 02:28:04 UTC
User

Check out our system at Novi.Systems. It's a fully integrated platform, including hardware and software, that does what you require and much more. We'd be glad to set up a demo that lets you load your data and "test drive" all the capabilities for up to four weeks. Contact me at mike@novi.systems

2020-01-10 17:22:46 UTC
Top 5, Leaderboard, Real User

Please check out H2O.ai, Azure ML, and TensorFlow.

2020-01-10 16:28:48 UTC
Top 5, Leaderboard, Real User

For an "end-to-end" data science platform, I would prefer KNIME.

I think KNIME is especially good at working with various data sources and at preprocessing, and it makes it easier to modify, add, or remove flows as circumstances change.

For analytics, I use KNIME nodes about 50% of the time and code in a Python node the other 50%. Either way, it gives you the flexibility to write your own code (I don't write R). And things are much simpler when the data is well preprocessed.

It also provides data visualisation nodes, which are good enough, but for fancy presentations you will want to try other tools such as Tableau.

It is therefore easy to scale up, as KNIME nicely simplifies the process before preprocessing.

2020-01-10 09:09:24 UTC
Top 20, User

I would suggest holding working sessions with DataRobot (if your implementation is on-prem). SageMaker is what I would recommend if you plan to use AWS.

2020-01-10 07:31:43 UTC
Top 20, Leaderboard, Consultant

If you want to perform some ETL along with feature management and model deployment, then I would recommend Alteryx + DataRobot.

2020-04-08 09:59:47 UTC
Top 5, Leaderboard, Consultant

The best data science platform is the one that best fits your requirements: the goal you want to reach, the data you have to feed into the platform, and the results you want in line with your goals. There are a lot of tools to choose from, but if you are not tied to one specific vendor, my suggestion is to try the most widely accepted ones. So try RapidMiner, SAS Enterprise Miner, KNIME, or Alteryx.

2020-01-15 15:06:42 UTC