Data Scientist - Upwork at Freelancer
Real User
Straightforward and easy-to-use, but not as easy-to-use as other solutions
Pros and Cons
  • "Working with complicated algorithms in huge datasets is really easy in Weka."
  • "Within the basic Weka tool, I don't see many tools that are available where we can analyze and visualize the data that well."

What is our primary use case?

I work a lot with university students.

One of my latest projects was a classification problem. I had to use different algorithms, such as neural networks, support vector machines, nearest neighbor, and decision trees, for the machine learning parts.

I can't remember the exact data set I recently worked with, but I have worked with different data sets for machine learning and data mining. I use many algorithms in Weka to train and test on those data sets.

How has it helped my organization?

In one case, a client of mine wanted to cluster their data into different classes in order to identify their different values. I preprocessed the data set they gave me, mainly in Weka, and was then able to identify valuable clusters for them. I could identify the features and traits of those clusters and communicate my results to the customer. The clustering was very useful for them.
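As a rough illustration of the kind of clustering described above, here is a minimal k-means sketch in plain Python. It is not Weka's implementation and not the reviewer's actual workflow; the customer data, the two features, and the starting centroids are all made-up assumptions.

```python
import math

def assign(points, centroids):
    """Group each point under the index of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters

def kmeans(points, centroids, iterations=10):
    """Plain k-means: alternate between assigning points and recomputing centroids."""
    for _ in range(iterations):
        clusters = assign(points, centroids)
        # New centroid = per-axis mean of the members; keep the old one if a cluster is empty.
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else c
            for members, c in zip(clusters, centroids)
        ]
    return centroids, assign(points, centroids)

# Hypothetical customers as (monthly_spend, visits_per_month).
customers = [(10, 1), (12, 2), (11, 1), (90, 9), (95, 10), (88, 8)]
centroids, clusters = kmeans(customers, centroids=[(0, 0), (100, 10)])

for centre, members in zip(centroids, clusters):
    print(f"centre={centre} size={len(members)}")
```

The centroid of each cluster is what you would read off to describe its "traits" to a customer, for example a low-spend group and a high-spend group.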

What needs improvement?

The Weka team should publish more accurate documentation; that would be really helpful. When it comes to data visualization, there are lots of ways in which data could be visualized, such as pie charts, and there are many more. But within the basic Weka tool, I don't see many tools that are available where we can analyze and visualize the data that well. They should focus more on data visualization, as I have experienced many issues in this area.

For how long have I used the solution?

I learned Weka during my MSc, around two years ago. Since then, I have used it from time to time for different projects involving data visualization, machine learning, and various algorithms. Many of those projects have come through Upwork.

Buyer's Guide
Weka
March 2024
Learn what your peers think about Weka. Get advice and tips from experienced pros sharing their opinions. Updated: March 2024.
768,857 professionals have used our research since 2012.

What do I think about the stability of the solution?

Stability-wise, it's good. My main issue is with the output. If everything were more dynamic and the visualization of the final output were better, we could gain a lot more from Weka; it would be more powerful, like Python and other languages. It's a stable environment, but proper documentation is needed.

What do I think about the scalability of the solution?

When it comes to capacity, I'm not too experienced with handling large amounts of data in Weka, so I can't really comment on the scalability.

How are customer service and support?

The technical support isn't that great. On a scale from one to ten, I would give their support a rating of five to six.

I have very little experience requesting support through Weka's official site. The support has been good, but it hasn't been quick; it takes some time. Generally speaking, the help available on platforms such as Stack Overflow is not that great.

Which solution did I use previously and why did I switch?

Currently, I am also using Tableau, SPSS, Python, TensorFlow, and a couple of other machine-learning platforms. 

Compared to Weka, there are thousands of learning materials available for Python and R. Support is great, and if you have questions, you get answers very quickly. Python is also compatible with many other platforms; for example, you can use TensorFlow. You can go very deep into neural networks, and everything can be implemented in programming languages such as Python and R.

With Weka, I have not seen very deep neural networks. It can be done, but it's very complicated, and it's much easier with Python. That is one of the main differences that I've seen. I feel like Python is more popular than R, but either way, we can do the same things with both languages. Overall, I feel like Python is easier to work with.

How was the initial setup?

Installing Weka is really easy. Loading a data set into Weka and analyzing it is a bit tricky in the beginning, but once you're used to it, it's not too hard. We can easily train models on the data sets using different classification algorithms and save them. Then we can use those saved models to test on the data sets again. That's something good that I have experienced with Weka.
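The train, save, reload, and test cycle mentioned above can be sketched in a few lines of plain Python. This is only an illustration of the workflow, not Weka's API: the toy nearest-neighbour class, the JSON file format, and the sample rows are all assumptions made for the example.

```python
import json
import math
import os
import tempfile

class NearestNeighbor:
    """Toy 1-nearest-neighbour classifier, standing in for any trained model."""

    def fit(self, rows, labels):
        self.rows, self.labels = [list(r) for r in rows], list(labels)
        return self

    def predict(self, row):
        best = min(range(len(self.rows)), key=lambda i: math.dist(row, self.rows[i]))
        return self.labels[best]

    def save(self, path):
        # Persist the learned state (here simply the stored training rows and labels).
        with open(path, "w") as f:
            json.dump({"rows": self.rows, "labels": self.labels}, f)

    @classmethod
    def load(cls, path):
        with open(path) as f:
            state = json.load(f)
        return cls().fit(state["rows"], state["labels"])

# Hypothetical training and test data.
train_rows = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.1), (4.8, 5.3)]
train_labels = ["low", "low", "high", "high"]
test_rows = [(0.9, 1.1), (5.2, 4.9)]
test_labels = ["low", "high"]

model = NearestNeighbor().fit(train_rows, train_labels)
model_path = os.path.join(tempfile.mkdtemp(), "model.json")
model.save(model_path)

# Later: reload the saved model and evaluate it on held-out rows.
restored = NearestNeighbor.load(model_path)
correct = sum(restored.predict(r) == y for r, y in zip(test_rows, test_labels))
accuracy = correct / len(test_rows)
print(f"accuracy of reloaded model: {accuracy:.2f}")
```

Saving the model separately from the data is what makes it possible to train once and re-test later without repeating the training step.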

Overall, it takes roughly 15 minutes to set up this solution.

Sometimes it can be a bit hard to identify the proper documentation packages to install into Weka. If that could be improved, it would be really great.

What about the implementation team?

Typically, I have my own implementation strategy that I follow; however, I would like more experience in this area.

I am looking forward to learning more about deploying these big concepts in cloud environments — enterprise applications as well. I haven't had the chance to do that yet but I am looking forward to getting into deeper areas related to Weka.

What's my experience with pricing, setup cost, and licensing?

Currently, I am using an open-source version so I don't know much about the price of this solution.

What other advice do I have?

The basic configuration is very easy. Compared to writing code in Jupyter Notebook, it's really easy to handle and work with very complicated algorithms in Weka. There are some steps that are not very simple, but overall, it's very easy. It's easy to load data and implement different algorithms with Weka. From my experiences so far, that's the basic advantage with Weka — it's easy to use, easy to handle, and once you learn it, it's not that hard to work with.

Working with complicated algorithms in huge datasets is really easy in Weka. Training on datasets is equally easy and quite fast, and the same goes for implementation. Without writing large amounts of code, we can quickly perform machine learning using Weka. That's its main advantage.

Overall, on a scale from one to ten, I would give Weka a rating of six.

If they improved the visualization issues, the documentation issues, and the implementation capabilities, I would give them a higher rating. According to my knowledge, there are not any boundaries when it comes to machine learning. The possibilities are endless, it's really big.  

It would be really helpful if more data pre-processing and data visualization options were supported. That's something very basic that we need when doing machine learning. More documentation would also be great. You can find specific knowledge on YouTube, but you can't go much further than that because the resources are just not available. These are the reasons why I am giving it a six.

With Python and R, you can do anything — you have that confidence, but with Weka, I don't have that confidence.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Data Science at Freelancer on UpWork
Real User
An excellent tool for data classification and clustering
Pros and Cons
  • "The path of machine learning in classification and clustering is useful. The GUI can get you results. No programming is needed. No need to write down your script first or send to your model or input your data."
  • "If there are a lot more lines of code, then we should use another language."

What is our primary use case?

I have only used Weka for classification and clustering. I have also used classification with embossing.

What is most valuable?

The path of machine learning in classification and clustering is useful. The GUI can get you results. No programming is needed. There is no need to write down your script first or send to your model or input your data. 

Weka's classification is very valuable; it gives results such as predictions. Weka is a little better than the other tools I have expertise in, and it is much better for classification and clustering.

If you are going with some predictions that a procedure recalls, it's better than any other tool like R Programming and Python. In machine learning, like deep learning, if the network works, I can run it with the console buttons. 

What needs improvement?

I think there is a little bit of space for improvement.

For how long have I used the solution?

I've been using Weka for five years.

What do I think about the stability of the solution?

I'm not sure if it's reliable. It's a little difficult to get results, especially if you are on some other programs like Tableau.

How was the initial setup?

There is no complexity in the setup. It took a total of 10 minutes to set it up.

What's my experience with pricing, setup cost, and licensing?

I like how the classification and prediction work. We should use Weka because the path is very big and much better. If there are a lot more lines of code, then we should use another language.

Which other solutions did I evaluate?

I enjoy using Weka most of the time for machine learning and development. We only perform a task from data mining, classification, and collecting on top of Weka. TIBCO Jaspersoft is only for masterwork and analysis and visualization.

What other advice do I have?

I would give Weka a nine out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Solution Architect / Data Scientist (upwork) at Freelancer
Real User
Has a good machine algorithm for clustering systems but is lacking a few newer algorithms
Pros and Cons
  • "I like the machine algorithm for clustering systems. Weka has larger capabilities. There are multiple algorithms that can be used for clustering. It depends upon the user requirements. For clustering, I've used DBSCAN, whereas for supervised learning, I've used AVM and RFT."
  • "I believe there are a few newer algorithms that are not present in the Weka libraries. For example, if I want to have a solution that involves deep learning, I don't think that Weka has that capability. So in that case I have to use Python for ... predict any algorithms based on deep learning."

What is our primary use case?

Weka is a machine learning tool in which we can use supervised and unsupervised learning to detect anomalies, cluster data, or run classification algorithms.

The deployment method depends on the business's requirements. When I worked at the Air Force, it was all cloud. I deployed it on the cloud, but that was treated as on-premises because it was confined within the Air Force. It depends on the user's requirements. If they want it on-premises, I can provide that. If they want it hosted on AWS or any other cloud service, that can also be done.

How has it helped my organization?

Our customers wanted a SQL-based query to generate anomalies from the data. That works well when the dataset is small or the set of attributes is known. If you at least have a definition of the differences between attributes, you can use SQL, whereas machine learning is quite different: you don't have explicit cases, and a kind of fuzzy logic is used to detect anomalies.

When they were using SQL, they had quality data, but it was generating thousands of anomalies that were not correct, because SQL can only be used when the differences between the attributes are at least defined. With Weka, we used a learning period, meaning the amount of data used to train the model to generate a condition.

When I used Weka for processing, I used these kinds of algorithms and it was very clear when I tested that output of the string algorithm using different techniques. I ran another Java program to check whether these anomalies are being properly predicted or not. So there I found that Weka is quite helpful compared to other programming techniques or the SQL-based solutions.

What is most valuable?

I like the machine algorithm for clustering systems. Weka has larger capabilities. There are multiple algorithms that can be used for clustering. It depends upon the user requirements. For clustering, I've used DBSCAN, whereas, for supervised learning, I've used AVM and RFT. 

Weka is useful for analyzing a data set or running algorithms on small data sets. For an enterprise solution, you can use the Weka libraries, or at least the algorithms available in them. In Java, I can manipulate all these algorithms and libraries to produce the desired result for a customer.

What needs improvement?

I believe there are a few newer algorithms that are not present in the Weka libraries. If I want to have a solution that involves deep learning, I don't think that Weka has that capability. In that case, I have to use Python to predict any algorithms based on deep learning.

What do I think about the stability of the solution?

Weka is a stable solution. It has been working well for the past two years. I spoke to a few of my work colleagues. Even a 40-year-old was built over on PowerPoint Weka frequencies still works well. So Weka is definitely a stable solution.

What do I think about the scalability of the solution?

Weka is not horizontally scalable. To run a large dataset in Weka, I would need a very large amount of resources. If I add another node and want a cluster environment for Weka, it will not work. If I have a large amount of data from various sources and it can be split into parts to be processed on different machines, I can install Weka on four machines and then write a program to move the data onto those four machines.

In that way, Weka can be horizontally scalable, but as a solution, it is not horizontally scalable. It is vertically scalable.

Weka doesn't require maintenance. Once the solution is left and it is deployed nobody is required to maintain it. Weka is quite stable, it doesn't cause any problems. If you want to deploy this in your enterprise, they help to properly implement those profits. Once it is properly implemented no maintenance is required.

How are customer service and technical support?

I have never used their technical support. 

Which solution did I use previously and why did I switch?

Python is quite a hostile solution. If I get data it may not be in the format I request to run an analysis. Python is quite handy and it is easier than Weka to implement.

Weka provides a UI. If a person is very new to machine learning or just wants to run an analysis, Weka requires minimal programming, but you need to have knowledge of artificial intelligence. If somebody doesn't have it, they can't implement it.

How was the initial setup?

The initial setup was very straightforward. I have been programming in Java for the last 20 years, so Java is quite easy for me. Weka is written in Java and it is open source. All courses are available in the first course of the Weka library.

When I implement a Weka solution with Java for a customer, it is quite straightforward: I just add the Weka JAR file as a dependency in the project, and then I can use all the functions and capabilities that Weka provides. There is good documentation for that, and there are examples showing how Weka's features can be implemented. It is quite easy to use.

The amount of time it takes to deploy depends on the requirements. For performance, it took me only a day, meaning eight hours of work, and I could provide a solution for the Weka part only. For the UI and for other things, that is different. 

Hardware took quite some time because the data was too large. Weka is not capable of handling a large amount of data. They wanted the solution to be in Java, and we didn't have any other libraries to do that. So I split the data into smaller chunks and ran the algorithms on each chunk. I then combined the outputs and manually produced the results. In that case, it took around six months to provide them a solution. So it can take a day, or it can take up to six months.
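The split, process, and combine approach described above can be sketched as follows. This is a plain Python illustration rather than the reviewer's Java code: the chunk size, the sample values, and the choice of statistic (mean and variance) are assumptions, picked because partial results for them can be merged exactly.

```python
from itertools import islice

def chunks(rows, size):
    """Yield successive lists of at most `size` rows without loading everything at once."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

def partial_stats(chunk):
    """Per-chunk result: just enough information to be merged later."""
    return {"n": len(chunk), "total": sum(chunk), "total_sq": sum(x * x for x in chunk)}

def combine(parts):
    """Merge the per-chunk results into global mean and (population) variance."""
    n = sum(p["n"] for p in parts)
    total = sum(p["total"] for p in parts)
    total_sq = sum(p["total_sq"] for p in parts)
    mean = total / n
    return {"n": n, "mean": mean, "variance": total_sq / n - mean ** 2}

# Hypothetical data set, processed four values at a time.
values = list(range(1, 11))
parts = [partial_stats(c) for c in chunks(values, size=4)]
summary = combine(parts)
print(summary)
```

Note that this only works cleanly when the per-chunk outputs can be merged. Many trained models cannot be combined this simply, which may be part of why the manual approach described above took so long.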

Implementing the algorithm doesn't take much of your time. What takes time is how much data a customer has and how clean the data is. In terms of performance, it was quite a good data set. Every field of their attributes was available. There was a feature called collation-based features and I used that and it collated the results within a few minutes. Based on that, I implemented KLN on that. It is quite dependent on the data set the customer provides, how clean the data is, and what the output they want out of that data set is.

What was our ROI?

I think Weka is definitely a good investment, that is why we still use it. It has performance analytics as well so I think it is a better solution than others.

What other advice do I have?

Weka is pretty comprehensive and easy to use.

This is the first time that I used machine learning. I have a master's in technology. I analyze small data to get insights into algorithms. I learned a lot from all the files, then I implemented those into a Dell program.

Many features are not available, and there is not much development since it is open source. It should be developed faster. I would rate Weka a six out of ten for these reasons.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Freelance Engineer at Autónomo
Real User
You can standardize data in an easier way but it should work with big data
Pros and Cons
  • "There are many options where you can fill all of the data pre-processing options that you can implement when you're importing the data. You can also normalize the data and standardize it in an easier way."
  • "The product is good, but I would like it to work with big data. I know it has a Spark integration they could use to do analysis in clusters, but it's not so clear how to use it."

What is our primary use case?

I used Weka for my Master's thesis. I've also used it a couple of times for personal use, for a quick analysis or graph. You can do a reselection quicker, get the graph, put it in your report, and do classification. If a project came up, I could develop it.

How has it helped my organization?

For my Master's thesis, I could do a quick analysis. We had a huge amount of data. It was not big data, but it was many gigabytes. I had to work with it in batches, but Weka helped me apply prediction models and test some hypotheses about why people were migrating, depopulating the rural towns and moving to the peripheral areas around the big cities. I used many tools, such as Access, Weka, and some GIS (Geographic Information System) tools for mapping. Weka made this analysis easy: I could select and apply models, and it helped me do it faster.

What is most valuable?

There are many options where you can fill all of the data pre-processing options that you can implement when you're importing the data. You can also normalize the data and standardize it in an easier way. In Python you have to write some lines of code to do this, but in Weka it is part of the data pre-processing. For the model evaluation, you can build models for classification.
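For readers unfamiliar with the two terms, here is roughly what the "some lines" in Python look like. This is a minimal sketch using only the standard library, not Weka's filter implementation; the sample values are made up, and using the population standard deviation is an assumption (some tools use the sample standard deviation instead).

```python
from statistics import mean, pstdev

def normalize(values):
    """Min-max normalization: rescale values into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to rescale
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Z-score standardization: shift to zero mean and scale to unit standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant column: every z-score is zero
    return [(v - mu) / sigma for v in values]

# Hypothetical numeric attribute.
ages = [20, 30, 40, 50, 60]
print(normalize(ages))
print([round(z, 3) for z in standardize(ages)])
```

Normalization keeps the original shape of the distribution within fixed bounds, while standardization is usually preferred when attributes have very different spreads.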

What needs improvement?

The product is good, but I would like it to work with big data. I know it has a Spark integration they could use to do analysis in clusters, but it's not so clear how to use it. The issue is how to handle large amounts of data. My thesis project was not so big, less than 100 gigabytes, but these tools would certainly be useful for bigger ones. They should integrate it better with Spark and have better cluster processing.

For how long have I used the solution?

I have been using Weka for one year. 

What do I think about the stability of the solution?

It crashes sometimes, but overall I think the product is good. You need a good computer to do a better analysis. I think it's stable, but there is room to grow. It's not the best product I have used for data mining, but it can give good results. You have to have patience and take care of your memory and CPU resources. Its performance could improve, but it is stable.

What do I think about the scalability of the solution?

I have not exported Python and then used it in a Kubernetes application. In that way, scalability is not so good. It should improve. With the amount of data it can process without crashing, it has to work better with other plugins or integration with other tools. The scalability is not so good.

How are customer service and technical support?

We don't use their official support, but we use their forums. Information is lacking on some specific points, but in general, I could find answers to 80% of my questions.

Which solution did I use previously and why did I switch?

I have previously used KNIME and Orange. KNIME has better reviews. Weka is still used, but it has not added many features in the past few years. I think Weka and Orange are a bit stuck. KNIME has grown a lot by adding more plugins, more machine learning capabilities, and more algorithms.

KNIME is more widespread and has better functionality.

How was the initial setup?

The initial setup was straightforward. It is very intuitive and the tool is easy to use. You do need some prior knowledge, because you have to know what some of the parameters are and how they influence the final model.

What other advice do I have?

Weka is a good way to start in data mining. You have to be clear on the basic concepts behind the models or algorithms you are going to use. It suits testing or doing some research first, but for production, it's not the best option. It would be a good tool for prototyping. KNIME is the best tool for data mining.

Weka is good for structured, tabular data. You can use many supervised or unsupervised algorithms, but it's very difficult to get interpretable results from the multilayer option it has. Neural networks are not so easy to understand if you work with Weka. It is better suited to tree-based or clustering algorithms than to neural networks, for which other tools can be more useful.

I would rate Weka a seven out of ten. 

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Oleksandr Ochkasov - PeerSpot reviewer
Consultant for the implementation of maintenance management and repair of equipment at IT-Enterprise
Consultant
Top 20 Leaderboard
User-friendly analysis and graphics
Pros and Cons
  • "Weka's best features are its user-friendly graphic interface interpretation of data sets and the ease of analyzing data."
  • "Weka is a little complicated and not necessarily suited for users who aren't skilled and experienced in data science."

What is our primary use case?

I mainly use Weka to check data for anomalies.

What is most valuable?

Weka's best features are its user-friendly graphic interface interpretation of data sets and the ease of analyzing data.

What needs improvement?

Weka is a little complicated and not necessarily suited for users who aren't skilled and experienced in data science.

For how long have I used the solution?

I've been using Weka for less than a month.

What do I think about the stability of the solution?

Weka is stable.

What do I think about the scalability of the solution?

Weka is not scalable.

How was the initial setup?

The initial setup was easy and took around five to ten minutes.

What other advice do I have?

I would rate Weka eight out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
CEO with 11-50 employees
Vendor
Weka is a very easy-to-use data mining solution; however, the pre-processing part is the hardest to use.

What is most valuable?

Weka is a very easy-to-use data mining solution. It is great for learning and for doing small experiments before exploring the data more deeply. Another important feature is the number and diversity of algorithms, which make Weka an excellent solution for rapid testing. The several interfaces it provides also allow a diverse range of applications and uses. Another important aspect is the ability to easily integrate new algorithms into the solution, as well as its integration with Java code.

How has it helped my organization?

I have used Weka both in teaching and in industry projects, for several types of Data Mining tasks.

What needs improvement?

Scalability and performance are the main areas for improvement in Weka, since it inherits the main limitations of Java and the JVM. Besides that, the pre-processing part of Weka is the hardest to use.

For how long have I used the solution?

I've used it for more than 10 years.

What was my experience with deployment of the solution?

No issues with deployment.

What do I think about the stability of the solution?

No issues with stability.

What do I think about the scalability of the solution?

Yes, fine tuning the JVM memory is something to be careful about.

How are customer service and technical support?

Customer Service:

It is open source and customer service is not something that is given. However, there is a community of Weka users and some documentation that can help with the use of Weka.

Technical Support:

Same thing as the customer service.

Which solution did I use previously and why did I switch?

I used the old Clementine solution (now in the IBM portfolio). Weka ends up being more versatile in terms of diversity of algorithms and integration flexibility, and it costs less.

How was the initial setup?

The setup is straightforward, just download and start using.

What about the implementation team?

I implemented in-house and for other companies.

What was our ROI?

High, since it has low costs and is very easy to use.

Which other solutions did I evaluate?

Yes, but long ago. I evaluated Oracle Data Mining, Clementine, and SAS Enterprise Miner.

What other advice do I have?

Data mining know-how is needed to use the solution, but that is to be expected, since this tool is for data scientists.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user