Badges

User Activity

14 days ago
Open source Apache Airflow is free to use and does not itself incur any cost. However, the managed solutions from AWS and GCP have a cost, as do packaged products like Astronomer.
18 days ago
Near-real-time analytics using near-real-time data ingestion.
18 days ago
I don't think using Apache Spark without Hadoop has any major drawbacks or issues. I have used Apache Spark quite successfully with AWS S3 on many batch-based projects. That said, for very high-performance systems, HDFS is a better option. The main problem with Apache…
18 days ago
If you are dealing with semi-structured data like JSON, Snowflake has great support for handling and querying it. It also works well as a data lake and can act as a one-stop solution for a data lake and cloud data warehouse. Query performance and low maintenance is…
18 days ago
It is a little more costly, but it has great features for keeping costs under control if used properly, such as different warehouse sizes, caching, auto-suspend of warehouses, and more.
18 days ago
Handling of semi-structured data like JSON. It has great support for JSON, and we can write SQL on JSON, which is amazing. Performance on semi-structured data is a little poorer than on structured data, but it is still great.
18 days ago
Very good review on Snowflake, very helpful.
18 days ago
Apache Airflow is a great orchestration and automation tool. Its connectivity with other systems is a great plus point, as are the interactive UI, the scheduling options, and its compatibility with Python.
4 months ago
I used CDP on-premises almost 2.5 years ago. I would rate it 8/10. I did not have much to compare it against in those days, since cloud was not accessible in my organisation. But CDP was definitely a good choice then with respect to the open source distribution. The installation was…
4 months ago
Have you used Purview, the Azure data governance tool? If yes, what's your view, and is it mature enough?
5 months ago
Snowflake is an amazing product. It is one of the best cloud data warehouses currently available. The separation of storage and compute and the warehouse concept make it unique. It has lots of features and low maintenance, and the cost can be optimised to a great extent if we understand…
5 months ago
Many features: 1) Separate warehouses and the control the user gets over them 2) Automatic caching 3) JSON and XML handling 4) Minimal DBA activity
5 months ago
We are using it as a data lake and a DWH.
5 months ago
Have you used Azure Purview for data governance, data lineage, and as a data catalog?
5 months ago
1. TCO: options for long-term commitment vs. pay-as-you-go 2. Ease of setup, security & performance 3. High availability & support

Reviews

Questions

Answers

18 days ago
Software Configuration Management
18 days ago
Business Process Management (BPM)
5 months ago
Data Warehouse

Comments

4 months ago
Infrastructure as a Service Clouds (IaaS)
5 months ago
Infrastructure as a Service Clouds (IaaS)

About me

I have 17+ years of experience building software. I initially worked primarily on Java, Spring, and database systems, then moved to distributed systems in the last 5 years. I have worked with Apache Spark, Hadoop, HBase, Hive, Kafka, etc. I have also had exposure to multiple cloud technologies on AWS, Azure, and GCP.