Snowflake has been making the rounds in the news lately. At a glance, it’s a cloud-based data platform. If you go a little deeper, it offers security, scalability, and a lot of ways to share data. Let’s get started on learning the ins and outs of Snowflake.
Aside from having a catchy name, Snowflake has been taking the data-warehousing market by storm. Large companies like Adobe, Square, and Capital One are flocking to adopt this new technology. In a region as crowded with innovation as Silicon Valley, what makes Snowflake the talk of the town? And after its highly successful IPO in September 2020, what makes Snowflake such a great enterprise solution for the future?
How We Used to Store Data
In the traditional model, companies built large servers in-house and hired specialized data engineers to manage them. I still remember going into dedicated storage rooms with a farm of data racks. I’ll never forget working in the freezing cold temperatures and the continuous humming of the servers. Over time, the amount of data we collect has grown exponentially. This data drives companies’ revenue and guides future products. Most people would agree that data is the 21st century’s version of oil. With this abundance of data, in-house storage has become unmanageable and requires an exorbitant amount of resources. This is where Snowflake has positioned itself, offering an enterprise solution built on a cloud data warehouse platform.
What Makes Snowflake Data Warehouse Different?
Snowflake is an easy-to-use data warehouse delivered as Software-as-a-Service (SaaS). The beauty of Snowflake is that there is no virtual or physical hardware to take care of. It is a managed service that runs on the major cloud providers (AWS, Microsoft Azure, and Google Cloud), and the Snowflake team handles all software updates and platform maintenance. Because it builds on cloud infrastructure, customers can spin compute capacity up or down to match their requirements and load. This cuts the time and cost associated with a traditional in-house data storage solution. Gone are the days of buying more hardware whenever more storage is required. A simple click of a button expands or shrinks the data warehouse footprint, and the range of configurations suits everyone from small businesses to enterprise-level companies. You can store a near-unlimited amount of data at affordable cloud storage rates, and because storage and compute are billed separately, you pay for computing resources only while they are running. This flexibility and scalability make Snowflake a very cost-effective solution for data warehousing.
Snowflake’s architecture has three layers:
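To make the pay-for-compute idea concrete, here is a minimal sketch in Python of how a compute bill might be estimated. It is an illustration under stated assumptions, not Snowflake’s actual billing logic: the credits-per-hour table reflects my understanding of standard warehouse sizes, the 60-second minimum is assumed, and the price per credit and the function name `estimate_compute_cost` are made up for this example.

```python
# Assumed credits consumed per hour for each warehouse size.
# Check Snowflake's current pricing before relying on these numbers.
CREDITS_PER_HOUR = {
    "XSMALL": 1,
    "SMALL": 2,
    "MEDIUM": 4,
    "LARGE": 8,
    "XLARGE": 16,
}

# Assumed minimum number of seconds billed each time a warehouse starts.
MIN_BILLED_SECONDS = 60


def estimate_compute_cost(size, run_seconds, price_per_credit):
    """Estimate the compute cost of running one warehouse for run_seconds.

    price_per_credit is a hypothetical figure supplied by the caller.
    """
    billed_seconds = max(run_seconds, MIN_BILLED_SECONDS)
    credits = CREDITS_PER_HOUR[size] * billed_seconds / 3600
    return credits * price_per_credit


if __name__ == "__main__":
    # A small warehouse for an hour vs. a large one for fifteen minutes.
    print(estimate_compute_cost("SMALL", 3600, 3.00))
    print(estimate_compute_cost("LARGE", 900, 3.00))
```

Under these assumptions, a warehouse four times the size that finishes in a quarter of the time costs the same, which is why scaling up for a heavy job and back down afterwards can be cheaper than it sounds.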
- Database Storage
Snowflake reorganizes loaded data into multiple partitions in its own compressed, optimized format. The data lives in cloud storage and behaves like a shared-disk model, which makes data management easier because users do not have to worry about how data is distributed across multiple nodes. All this magic is handled behind the scenes by Snowflake and is not visible to customers. The data can be accessed only through Snowflake, using traditional SQL queries.
- Query Processing
The query processing layer is where queries are executed. It uses data from the database storage layer for computation. Snowflake uses “virtual warehouses” for its computing needs. A virtual warehouse is an MPP (massively parallel processing) cluster made up of multiple compute nodes, each with CPU and memory, that Snowflake provisions in the cloud. Depending on the workload, you can run multiple virtual warehouses at once. Since each warehouse is an independent compute cluster that does not share resources with the others, adding or resizing one warehouse does not affect the performance of the rest.
- Cloud Services
This layer handles activities such as authentication, security, metadata management for loaded data, and query optimization.
Examples of services in this layer:
- Infrastructure Management
- Query optimization: each query submitted to Snowflake is optimized in this layer before it is sent to Query Processing
- Metadata management: the metadata needed to optimize queries and filter data is stored in this layer
- Access control
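The way stored metadata helps filter data can be sketched with a toy example. The code below is a simplified illustration, not Snowflake’s implementation: it assumes each partition records the minimum and maximum value of a date column, and it uses that metadata to skip partitions that cannot contain matching rows. The `Partition` class, the `prune` and `scan` functions, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    """A toy partition: rows plus min/max metadata for one column."""
    name: str
    min_date: str  # ISO dates compare correctly as plain strings
    max_date: str
    rows: list


def prune(partitions, start, end):
    """Keep only partitions whose date range overlaps [start, end]."""
    return [p for p in partitions if p.max_date >= start and p.min_date <= end]


def scan(partitions, start, end):
    """Read rows only from partitions that survive pruning."""
    selected = prune(partitions, start, end)
    matches = [
        row
        for p in selected
        for row in p.rows
        if start <= row["order_date"] <= end
    ]
    return selected, matches


# Hypothetical sample data: one partition per month.
PARTITIONS = [
    Partition("p1", "2020-01-01", "2020-01-31", [
        {"order_id": 1, "order_date": "2020-01-05"},
        {"order_id": 2, "order_date": "2020-01-28"},
    ]),
    Partition("p2", "2020-02-01", "2020-02-29", [
        {"order_id": 3, "order_date": "2020-02-14"},
        {"order_id": 4, "order_date": "2020-02-27"},
    ]),
    Partition("p3", "2020-03-01", "2020-03-31", [
        {"order_id": 5, "order_date": "2020-03-09"},
    ]),
]

if __name__ == "__main__":
    selected, matches = scan(PARTITIONS, "2020-02-10", "2020-02-20")
    print([p.name for p in selected])
    print(matches)
```

In this sketch, a query for mid-February touches only one of the three partitions. The decision is made from metadata alone, before any rows are read, which is the general idea behind keeping that metadata in a separate services layer.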
Pros and Cons of Snowflake Data Warehouse
Now that I’ve highlighted some of the features and architecture of Snowflake, let’s summarize the pros and cons. First, the pros:
- Storage capacity is no longer an issue
Because Snowflake runs on cloud infrastructure, worrying about running out of storage space is a thing of the past.
- No more buying big servers
Racks of servers? Who needs those anymore? Let the cloud provider worry about them. With the maturity of the cloud and availability targets as high as 99.99%, managing an in-house server is a thing of the past.
- Data security
Data security is always one of the most important requirements for any company. Snowflake provides multi-factor authentication, AES-256 encryption, and IP whitelisting.
- Performance tuning
Snowflake manages its own performance tuning. As long as the customer follows the best practices guide, the system will be optimized for high performance.
- Software maintenance and upgrades
Like all cloud-managed services, Snowflake handles all maintenance requirements behind the scenes.
- Performance and Concurrency
Snowflake handles workload changes by resizing a warehouse or by adding clusters. Multi-cluster warehouses can scale out and back in automatically as demand changes.
And now the cons:
- Dependent on Cloud Provider
You still have a dependency on the cloud provider. If AWS goes down, it will directly affect your data warehouse.
- No support for unstructured data
Snowflake is currently a relational database, built for structured and semi-structured data.
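Going back to the Performance and Concurrency point above, here is a rough sketch of what resizing looks like in practice. It is a small Python helper that only builds the SQL text and does not connect to Snowflake. The `ALTER WAREHOUSE` statements follow Snowflake’s SQL as I understand it, so check the official documentation before running them. The helper names and the warehouse name `reporting_wh` are hypothetical, and the multi-cluster settings assume an edition that supports multi-cluster warehouses.

```python
# Warehouse sizes accepted by this sketch (Snowflake offers larger ones too).
VALID_SIZES = {"XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE"}


def _check_name(name):
    """Allow only simple identifiers so the generated SQL stays safe."""
    if not name.isidentifier():
        raise ValueError(f"invalid warehouse name: {name!r}")


def resize_warehouse_sql(name, size):
    """Build a statement that changes the size of an existing warehouse."""
    _check_name(name)
    size = size.upper()
    if size not in VALID_SIZES:
        raise ValueError(f"unknown warehouse size: {size!r}")
    return f"ALTER WAREHOUSE {name} SET WAREHOUSE_SIZE = '{size}'"


def multi_cluster_sql(name, min_clusters, max_clusters):
    """Build a statement that sets the cluster range for automatic scale-out."""
    _check_name(name)
    if not 1 <= min_clusters <= max_clusters:
        raise ValueError("need 1 <= min_clusters <= max_clusters")
    return (
        f"ALTER WAREHOUSE {name} SET "
        f"MIN_CLUSTER_COUNT = {min_clusters} MAX_CLUSTER_COUNT = {max_clusters}"
    )


if __name__ == "__main__":
    print(resize_warehouse_sql("reporting_wh", "large"))
    print(multi_cluster_sql("reporting_wh", 1, 3))
```

The first statement makes a single warehouse bigger for heavier queries. The second lets Snowflake add or remove clusters within the given range as the number of concurrent users changes.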
This barely scratches the surface of what Snowflake can provide. Features like time travel and its data analytics capabilities are just part of a long list. Snowflake lets customers focus more on their core product and less on infrastructure, and it offers an easy and effective solution for most data storage needs.
If you are looking to move to the cloud, or you are already using it and just don’t know how to make the most of it, we can help. We like to stay on top of new tech like Snowflake, and we make sure we provide honest and supportive service so you can do what you do best.