How do you or your organization use this solution?
Please share with us so that your peers can learn from your experiences.
We used Databricks on AWS, on top of S3 buckets, as a data lake. The primary use case was providing consistent, ACID-compliant data sets with full history and time series that could be used for analytics.
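The "full history" part of this use case typically relies on Delta Lake's time-travel feature, which lets you query a table as it existed at a past version or timestamp. A minimal sketch, assuming a hypothetical table name (`sales_history`) — in a Databricks notebook these strings would be passed to `spark.sql(...)`:

```python
# Sketch: Delta Lake time-travel queries for point-in-time analytics.
# The table name is illustrative; the VERSION AS OF / TIMESTAMP AS OF
# syntax is standard Delta Lake SQL.

TABLE = "sales_history"  # hypothetical Delta table backed by S3

def latest_query(table: str) -> str:
    """Current state of the table."""
    return f"SELECT * FROM {table}"

def as_of_version(table: str, version: int) -> str:
    """The table as it looked at a past Delta commit."""
    return f"SELECT * FROM {table} VERSION AS OF {version}"

def as_of_timestamp(table: str, ts: str) -> str:
    """The table as it looked at a given point in time."""
    return f"SELECT * FROM {table} TIMESTAMP AS OF '{ts}'"

print(as_of_version(TABLE, 12))
# SELECT * FROM sales_history VERSION AS OF 12
```

Because every write to a Delta table is an ACID commit, these queries give reproducible historical snapshots without maintaining separate history tables.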
We primarily use the solution for retail and manufacturing companies. It allows us to build data lakes.
We use this solution to build skill and text classification models.
The primary use is for data management and managing workloads of data pipelines. Databricks can also be used for data visualization, as well as to implement machine learning models. Machine learning development can be done using R, Python, and Spark programming.
We are working with Databricks and SMLS in the financial sector for big data and analytics. There are a number of business cases there for debt-related analysis. Several clients are working with it, analyzing data collected over a period of time and planning next steps across multiple business divisions. My organization is a professional consulting service. We provide services for other organizations, which implement and use these solutions in a production environment. We manage, implement, and upgrade those services, but we don't use them ourselves.
We are a consulting house and we employ solutions based on our customers' needs. We don't generally use products internally. I am a certified data engineer from Microsoft and have worked on the Azure platform, which is why I have experience with Databricks. Now that Microsoft has launched Synapse, I think that there will be more use cases.
Currently, I am using this solution for a forecasting project.
Our primary use case is to decrease costs and prevent any security breaches involving our data. I'm an IT manager and we are customers of Databricks.
We specialize in project consulting for our clients. Whenever we get the opportunity, we recommend Databricks to them.
Our primary use case of Databricks is for advanced analytics. I'm the chief research officer of the company and we're customers of Databricks.
We work with clients in the insurance space mostly. Insurance companies need to process claims. Their claim systems run under Databricks, where we do multiple transformations of the data.
My division works with Big Data and Data Science, and Databricks is one of the tools for Big Data that we work with. We are partners with Microsoft and we began working with this solution for one specific project in the financial industry.
I am a data scientist here and that is my official role. I own the company. Our team is quite small at this point: around five people, working with about five different businesses. The projects we get from them are massive undertakings. Each of us on the team takes on multiple roles, and we use multiple tools to best serve our clients. We look at creative ways that different solutions can be integrated and try to understand which products we can use to build solutions that effectively meet our client companies' needs.

We are using Databricks for certain projects where we want to create intelligent solutions. As part of my role, I have been evaluating whether there are any standard products we can use with Databricks to build those solutions. We know that Databricks integrates with Airflow, so we are exploring that right now as a potential option. We are also exploring the cloud: Databricks is available in Azure, and we are currently assessing the viability of Azure as a cloud platform, looking at how Databricks and Azure integrate to give us that kind of flexibility.

What we use it for right now is mostly asset management. We have a lot of assets and receive a lot of real-time data. We certainly want to do some processing on some of this data, but we do not want to work on all of it in real time. That is why we use Databricks. We push the data from Azure through Databricks, develop and run the algorithm in Databricks, and execute it from Azure, probably with an RPA (Robotic Process Automation) tool or something of that sort. It intelligently offloads real-time processing.
We are still exploring the solution. Our clients get much better use out of it than out of the star schema models they are trying to replace. We bring in Databricks and then see how they can leverage the additional analytical functionality around the Databricks cloud. It's still mostly exploratory. We recommend Databricks, especially with the Azure cloud frameworks.
We use the solution for multiple things. We do a lot of data crunching, development, and algorithm work on it.
We are building internal tools and custom models for predictive analysis. We are currently building a platform where we can integrate multiple data sources, such as data that is coming from Azure, AWS, or any SQL database. We integrate the data and run our models on top of that. We primarily use Databricks for data processing and for SQL databases.
We primarily use the solution to run recurring jobs; we run our Spark jobs as scheduled, recurring jobs.
We use this solution for streaming analytics. We use machine learning functions that output to the API and work directly with the database.
I am a developer and I do a lot of consulting using Databricks. We have been primarily using this solution for ETL purposes. We also do some migration of on-premises data to the cloud.
Our primary use case is really DevOps, for continuous integration and continuous development. We've combined our database with some components from Azure to deploy elements in a sandbox for our data scientists and data engineers.
We are using this solution to run large analytics queries and prepare datasets for Spark ML and machine learning using PySpark. We ran on multiple clusters, each set up with a minimum of three and a maximum of nine nodes with 16 GB of RAM each. For one ad hoc requirement, a 32-node cluster was required. Databricks clusters were set to autoscale and to time out after forty minutes of inactivity. Multiple users attached their notebooks to a cluster. When a workload required different libraries, a dedicated cluster was spun up for that user.
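A cluster setup like the one described above can be expressed as a create-cluster payload for the Databricks Clusters REST API. This is a hedged sketch: the cluster name, runtime version, and instance type are illustrative (any AWS node type with roughly 16 GB of RAM would match the description), while the autoscaling and auto-termination fields mirror the figures in the review.

```python
# Sketch of a Databricks Clusters API payload for a 3-9 node autoscaling
# cluster that terminates after 40 minutes of inactivity. Values marked
# "example" are assumptions, not taken from the review.
cluster_spec = {
    "cluster_name": "analytics-autoscaling",      # example name
    "spark_version": "7.3.x-scala2.12",           # example runtime
    "node_type_id": "m5.xlarge",                  # example ~16 GB AWS node
    "autoscale": {
        "min_workers": 3,                         # minimum of three nodes
        "max_workers": 9,                         # maximum of nine nodes
    },
    "autotermination_minutes": 40,                # time out after 40 min idle
}
```

For the ad hoc 32-node requirement, one would submit a second payload with a fixed `num_workers` instead of the `autoscale` block, and per-user library needs can be handled by attaching libraries to a dedicated cluster, as the review describes.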