What is our primary use case?
Our company has about 2,000 workers, and I use the K-means algorithm all the time. For example, we group workers into clusters so we can study them more effectively. We might target the groups of workers who tend to make more errors so we can concentrate on reducing the effects of that issue. We can further break down the groups by age, by years of service in the company, or in other ways that may be helpful. Once we have found the groups, we can do other types of analysis and use the results of the study to apply solutions. In this case, where we are looking to reduce errors, one solution might be to deploy more supervision and to evaluate more of the variables for the group of employees we have isolated for performance issues. For example, we may want to reduce excessive overtime to improve alertness. We use TFS (Team Foundation Server), and we use the K-means algorithm to help optimize the behaviors in the groups that we identify.
I think in the future this kind of algorithm will be used widely by people to predict many things and to improve the performance of businesses and individuals.
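To make the grouping idea concrete, here is a minimal sketch of K-means in plain Python. It is not how SPSS implements the algorithm, and the worker records, the column choice (age, years of service, errors per month), and the function names `standardize` and `kmeans` are all hypothetical, made up for illustration only.

```python
import random
import statistics


def standardize(rows):
    """Z-score each column so age, service years, and error counts share one scale."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    stdevs = [statistics.pstdev(col) or 1.0 for col in columns]
    return [
        tuple((value - m) / s for value, m, s in zip(row, means, stdevs))
        for row in rows
    ]


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=50, seed=0):
    """Plain K-means (Lloyd's algorithm) on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen data points
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the clusters are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels, centroids


# Hypothetical sample: (age, years_of_service, errors_per_month) per worker.
workers = [
    (24, 1, 9), (26, 2, 8), (23, 1, 10),
    (45, 20, 2), (50, 25, 1), (48, 22, 2),
]

labels, centroids = kmeans(standardize(workers), k=2)
for worker, label in zip(workers, labels):
    print(worker, "-> group", label)
```

The columns are standardized first because K-means relies on distances, and otherwise the column with the largest numeric range would dominate the grouping. The group numbers themselves are arbitrary; what matters is which workers end up together.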
What is most valuable?
I like that the solution provides many built-in options for us. It includes many existing algorithms that we can use. Using them effectively depends on us, because we have to understand how to apply any particular algorithm. But the product is at a good level when it comes to ease of use. You have to have the data to work with first. You retrieve the data you want from the database, format it, and load it into SPSS. Then you run the analysis, and from the results it returns you can take better steps towards a business solution.
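The retrieve-and-format step described above can be sketched as a small script that runs a query and writes the result as CSV, a format SPSS Statistics can import. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` with an in-memory database as a stand-in for the real database, and the `workers` table, its columns, and the helper name `export_query_to_csv` are hypothetical.

```python
import csv
import io
import sqlite3


def export_query_to_csv(conn, query, out):
    """Run a query and write its result, with a header row, as CSV to `out`."""
    cursor = conn.execute(query)
    writer = csv.writer(out)
    # cursor.description holds one entry per result column; item 0 is the name.
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)


# Stand-in database with a hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE workers ("
    "worker_id INTEGER, age INTEGER, "
    "years_of_service INTEGER, errors_per_month INTEGER)"
)
conn.executemany(
    "INSERT INTO workers VALUES (?, ?, ?, ?)",
    [(1, 24, 1, 9), (2, 45, 20, 2)],
)

# In practice `out` would be a file opened with newline=""; a string buffer
# keeps this sketch self-contained.
buffer = io.StringIO()
export_query_to_csv(
    conn,
    "SELECT worker_id, age, years_of_service, errors_per_month "
    "FROM workers ORDER BY worker_id",
    buffer,
)
csv_text = buffer.getvalue()
print(csv_text)
```

With a production database, only the connection line would change, to use that database's own Python driver.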
What needs improvement?
The areas of SPSS Statistics that have room for improvement, where it could be simplified, have to do with how you connect SPSS to the data. They could provide enhancements like a query console so we can connect to the database and feed data into SPSS more easily. That would be really beneficial and save time and effort. Just enhancing the tools for collecting the data would be significant.
One other enhancement I would like to see is the ability to create higher-level presentations. When you finish processing the data, you get a result. If you are a technician or a data analyst, it is not so hard to read that result. But managers at a higher level do not want to see the raw data, and it is not easy for them to understand when it is shown that way. It would be good to have more ways to create a presentation that is easier to look at and understand.
In short, they need to improve the ability to connect to databases and provide a query console and query windows. The processing engine is already there, but the data feed has to be better. They should also provide different ways to present the results. Graphical presentation options would help make the results more intuitive for the user.
For how long have I used the solution?
We have been using the IBM SPSS solution for about a year.
What do I think about the stability of the solution?
SPSS Statistics doesn't seem to have bugs, glitches, or crashes. It's pretty stable and it functions well under the loads we have tested. We currently have no more than 10 SPSS users, and the data that we process with SPSS is never really more than 2,000 rows. So we only know that it is stable at this level of usage. If we had to process 100,000 rows, I don't know what the performance would be. From what I know, we would not see any problems, but so far our usage is limited.
What do I think about the scalability of the solution?
While we don't really know yet what the performance will be like under greater loads for big data, I am pretty confident that the product is very scalable.
How are customer service and technical support?
We have contacted the IBM SPSS technical support team by phone, and they help us remotely using TeamViewer. They assist us promptly and I think the service is good. We are satisfied with the level of technical support we receive.
Which solution did I use previously and why did I switch?
We did not use a different solution because we are trying to stay with Oracle and IBM solutions for compatibility and dependability.
How was the initial setup?
The initial setup of the product is easy.
What about the implementation team?
We did not do the deployment ourselves; we had someone do it. Even though it was not me who did the deployment, I know how they did it and I was involved with the group who did it. So I am familiar with the initial setup, but we asked integrators for assistance so that we would not divert internal resources and could be sure it was done right.
What other advice do I have?
My advice to other users is that if they are not doing proper data analysis, there are better solutions, like SPSS, for predicting and analyzing the data. Assuming the data is already there, they may just be printing a report, pulling a PDF, or entering information into the database. But they are not really doing what they could with the data.
Now there are better solutions that we can and should use to produce information from the data in a way we did not before. It is a better way and the way of the future for data analysis. We need to use predictive algorithms and AI to get more from the information we collect and use it to create better solutions for our businesses. Data analysis is different now than it was even 5 years ago. We can refine the data and derive facts from the analysis that make predictions about the future more viable and accurate.
On a scale from one to ten, where one is the worst and ten is the best, I would rate IBM SPSS Statistics a 7 out of 10. To improve that score, they need to enhance data access and export so that it takes less work, add dashboards for queries, and give more options for reporting output.
Which deployment model are you using for this solution?