Enterprises must carefully evaluate how well cloud-based databases and analytics tools meet current needs and balance cost, complexity, performance, and flexibility. Steve Sarsfield, director of product marketing for Vertica, shares key questions to ask before choosing a cloud database.
Whether you plan to run your analytics workload on a single public cloud, across multiple cloud providers, on-premises, or on hybrid infrastructure, the database you choose has a significant impact on cost, performance, and business value. With so many database and analytics vendors on the market at various stages of maturity, here are some key questions IT teams can ask before settling on a cloud-based database.
Can it be deployed anywhere?
Some SaaS platforms require all data to be uploaded to one specific cloud. This locks the customer into a single solution, with no freedom to migrate easily to another cloud or to take advantage of cheaper compute when it becomes available. When a SaaS solution is labeled "cloud," it may actually mean "in the cloud only," and it may not be able to handle workloads anywhere else. Essentially, you are forced to upload data to a single cloud and analyze it with a single engine. Simpler billing can be an advantage here, but the business is limited to one type of cloud deployment.
Can you use data from anywhere?
External data and data lakes are becoming more common in enterprises, but analytics solutions vary widely in how they manage workloads and data storage. You want to be able to query data both inside and outside the database, so that users can reach more data sources even when those sources have not been loaded into the database. The time spent loading data, and the amount an organization pays while that data sits in cloud storage, can make a big difference. Insisting on keeping all data in one type of database is a mistake; most modern solutions combine a data warehouse and a data lake for analytics.
Is it customizable and flexible?
When users run into slow queries, the usual recommendation for cloud databases is node-based scaling: many cloud systems simply add more nodes to increase processing power. However, analytical workloads are not one-size-fits-all, and database performance can suffer for many reasons, such as quarterly reporting, a successful marketing campaign that generates more data, or poorly written queries.
That's why it's important to understand what options are available to speed up your queries. Look for systems that offer:
- Massively parallel architecture: the architecture may require manual segmentation and special modification of queries to make use of the cluster.
- Node scaling: adding nodes and controlling node size or configuration.
- Load management: matching resources, such as memory and CPU, to specific types of queries or a specific set of users.
- Separation of compute and storage: data is kept in object storage, while compute nodes are spun up to serve parallelism, backups, dashboards, and data science.
- Query optimization: query planners that determine the best way to limit the data reads and memory needed to answer a query.
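Load management, as described above, can be pictured as a lookup from workload type to a reserved slice of memory and concurrency. The following is a minimal sketch under stated assumptions: the `ResourcePool` class, the pool names, and every number are hypothetical and do not reflect any vendor's actual settings or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourcePool:
    """A hypothetical slice of cluster resources reserved for one workload type."""
    name: str
    memory_gb: int
    max_concurrency: int


# Illustrative pools only; real products expose their own knobs and names.
POOLS = {
    "dashboard": ResourcePool("dashboard", memory_gb=4, max_concurrency=20),
    "quarterly_report": ResourcePool("quarterly_report", memory_gb=32, max_concurrency=2),
    "data_science": ResourcePool("data_science", memory_gb=64, max_concurrency=4),
}

# Anything unclassified falls back to a general-purpose pool.
DEFAULT_POOL = ResourcePool("general", memory_gb=8, max_concurrency=10)


def pool_for(workload_type: str) -> ResourcePool:
    """Return the pool assigned to a workload type, or the default pool."""
    return POOLS.get(workload_type, DEFAULT_POOL)


if __name__ == "__main__":
    for workload in ("dashboard", "quarterly_report", "ad_hoc"):
        pool = pool_for(workload)
        print(f"{workload}: {pool.name} pool, {pool.memory_gb} GB, "
              f"up to {pool.max_concurrency} concurrent queries")
```

The point of the sketch is that many small dashboard queries and a few heavy reports get different limits, so one kind of workload does not starve the other.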
Does it support different analytic user roles?
As the cloud database gains popularity in the organization, be prepared to support a wide range of requests from different users (business users, analysts, data scientists). Which features are offered, and at what price?
One should always consider the depth of the analytical functions offered. Features may include:
- Time series: SQL functions built into the database for working with data recorded at specified time intervals.
- Geospatial: SQL functions based on latitude, longitude and altitude.
- Machine learning: Ability to train, manage and deploy machine learning models.
- Alternative frameworks: Support for data science and additional languages beyond SQL.
Does it help you control costs?
When moving analytics to the cloud, costs can quickly spiral out of control, and businesses can find themselves locked into long-term contracts. Make sure the solution lets users scale billing down when the system is not in use and lets them set clear spending limits, so there are no surprises at the end of the month. IT teams need to understand how the database scales automatically for long or complex queries, or when many concurrent workloads hit the system at once. When the database automatically creates additional nodes, those nodes are automatically added to the monthly bill.
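To see how automatically added nodes can push a bill past a spending limit, a rough estimate can be worked out before signing up. This is a minimal sketch only: the function names, the per-node-hour price, the 730-hour month, and the spending limit are all hypothetical placeholders, not any provider's real pricing model.

```python
def monthly_compute_cost(base_nodes: int,
                         burst_nodes: int,
                         burst_hours: float,
                         price_per_node_hour: float,
                         hours_in_month: float = 730.0) -> float:
    """Estimate monthly compute cost for always-on nodes plus autoscaled bursts.

    All inputs are assumptions supplied by the caller; real billing may use
    different units (per second, per credit, per query).
    """
    base_cost = base_nodes * hours_in_month * price_per_node_hour
    burst_cost = burst_nodes * burst_hours * price_per_node_hour
    return base_cost + burst_cost


def exceeds_limit(cost: float, spending_limit: float) -> bool:
    """True when the estimated cost is above the agreed spending limit."""
    return cost > spending_limit


if __name__ == "__main__":
    # Hypothetical scenario: 4 always-on nodes, 4 extra nodes for 100 burst hours.
    cost = monthly_compute_cost(base_nodes=4, burst_nodes=4,
                                burst_hours=100, price_per_node_hour=2.0)
    print(f"Estimated compute cost: {cost:.2f}")      # 6640.00
    print("Over limit:", exceeds_limit(cost, 6000))   # True
```

In this made-up scenario the always-on nodes fit within the limit by themselves, and it is the burst capacity that tips the total over, which is the kind of surprise a hard spending limit is meant to prevent.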
Shared storage is also critical because it allows multiple teams to use the same data without creating copies. More copies mean more storage, and more storage means more money. Last but not least, many providers charge an egress fee per megabyte of data moved off their platform. Be wary of platforms that charge you to get your own data back.
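The arithmetic behind copies and egress fees is simple enough to sketch. The example below is illustrative only: the function names and both prices are hypothetical, amounts are kept in whole cents to avoid rounding noise, and the fee is expressed per gigabyte rather than per megabyte purely for readability.

```python
def monthly_storage_cents(dataset_gb: int, copies: int,
                          cents_per_gb_month: int) -> int:
    """Storage cost in cents when each team keeps its own copy of the data."""
    return dataset_gb * copies * cents_per_gb_month


def egress_cents(data_gb: int, cents_per_gb: int) -> int:
    """One-time cost in cents of moving data off the platform."""
    return data_gb * cents_per_gb


if __name__ == "__main__":
    DATASET_GB = 1000       # hypothetical dataset size
    STORAGE_PRICE = 2       # hypothetical: 2 cents per GB per month
    EGRESS_PRICE = 9        # hypothetical: 9 cents per GB moved out

    three_copies = monthly_storage_cents(DATASET_GB, 3, STORAGE_PRICE)
    shared = monthly_storage_cents(DATASET_GB, 1, STORAGE_PRICE)
    leaving = egress_cents(DATASET_GB, EGRESS_PRICE)

    print(f"Three team copies: ${three_copies / 100:.2f} per month")  # $60.00
    print(f"Shared storage:    ${shared / 100:.2f} per month")        # $20.00
    print(f"Egress, one time:  ${leaving / 100:.2f}")                 # $90.00
```

With these placeholder prices, three copies triple the storage bill, and a single full export costs more than four months of shared storage.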
The reality is that business, technology, consumer expectations, and the regulatory environment are evolving so rapidly that no one can predict the analytics and storage requirements of the future. Today, we might move data to the cloud. Tomorrow, we might move it back on-premises, and some days we will want to use both. That is why the chosen database must be flexible, scalable, and ready for the future.
MORE ABOUT CLOUD DATABASE
https://www.spiceworks.com/tech/cloud/guest-article/key-questions-to-ask-before-selecting-a-cloud-database/ Key questions to ask before choosing a cloud database