Dima Statz

209 Followers

Sep 28, 2022

Visum — A Cloud Cost Optimization Platform. Part 1

Background: The worldwide infrastructure-as-a-service (IaaS) market grew 41.4% in 2021 to a total of $90.9 billion, up from $64.3 billion in 2020. It is expected to be as high as $121.62 billion in 2022. Cloud adoption is still growing, and the vast majority of enterprise organizations and more than 50%…

Apache Spark

5 min read

Published in ITNEXT · May 10, 2021

Monitoring Spark Streaming on K8s with Prometheus and Grafana

Introduction: Cost efficiency and portability are the main reasons to migrate Apache Spark workloads from managed services like AWS EMR, Azure Databricks, or HDInsight to Kubernetes. You can learn more about the migration process from AWS EMR to K8s in the following article. However, there are also potential pitfalls with leaving…

Apache Spark

5 min read

Published in ITNEXT · Dec 28, 2020

Benchmarking Graviton2 processors with Apache Spark workloads

Introduction: Amazon EC2 provides a broad portfolio of compute instances, including many that are powered by the latest-generation Intel and AMD processors. AWS Graviton2 processors add even more choice. …

Emr

7 min read

Published in ITNEXT · Nov 14, 2020

Processing costs measurement on multi-tenant EMR clusters

Introduction: One of the 5 pillars of the Well-Architected Framework is Cost Optimization. The Cost Optimization pillar focuses on avoiding unnecessary costs, selecting the most appropriate resource types, analyzing spend over time, and scaling in and out to meet business needs without overspending. …

Emr

6 min read

Published in ITNEXT · Sep 30, 2020

Migrating Apache Spark workloads from AWS EMR to Kubernetes

Introduction: ESG research found that 43% of respondents are considering the cloud as their primary deployment for Apache Spark. This makes a lot of sense because the cloud provides scalability, reliability, availability, and massive economies of scale. Another strong selling point of cloud deployment is a low barrier to entry in the…

Apache Spark

10 min read

Published in ITNEXT · Jul 9, 2020

Monitoring the performance of software teams using Github, Jira, and Grafana

Introduction: COVID-19 has had a huge impact on the world. Every single aspect of our lives went through a massive change, and just like many other software development teams in the world, our team went fully remote. I have no idea when our office will be open, and probably, offices may never…

Jira

8 min read

Published in ITNEXT · May 28, 2020

Monitoring Distributed Jetty Servers in K8s using Prometheus and Grafana

Introduction: Monitoring and alerting are a mandatory part of any software system running in a production environment. To keep software systems healthy and to optimize performance and resource utilization, you need a unified operational view, real-time granular data, and historical reference. …

Kubernetes

6 min read

May 17, 2020

No-Code Data Collect API on AWS

Introduction: This article is all about moving data into Big Data pipelines running on AWS. Most data pipelines have 5 steps in common: collection -> storage -> processing -> analysis -> visualization, and AWS has a very solid foundation for building all these steps. …

Data Pipeline

9 min read

Published in ITNEXT · Apr 30, 2020

Handling Data Skew in Apache Spark

Introduction: One of the well-known problems in parallel computational systems is data skew. In Apache Spark, data skew is usually caused by transformations that change data partitioning, such as join, groupBy, and orderBy. For example, joining on a key that is not evenly distributed across the cluster, causing some partitions to be…

Apache Spark

8 min read

Feb 3, 2020

Vertica DB performance test with locust.io

“Define user behavior with Python code, and swarm your system with millions of simultaneous users” (locust.io)

Introduction: Vertica is a very powerful analytic database designed to handle massive amounts of data. When configured correctly, Vertica delivers great query performance even in very intensive scenarios. Some Vertica users report that…

Python

6 min read

