Have I started hating mysql and falling in love with distributed databases

It seems Mysql is rock solid if you want:

Transactions
ACID support

So I would still recommend mysql for any thing that is mission critical data and is the primary datastore for your transactions. But what about derived data or analytical data?

I had built large scale cluster of mysql server storing metadata about billions of files and folders used by tens of thousands of customers daily and its scaling fine and working good, its still growing at a healthy rate and holding up. But this requires a lot of baby sitting if you have 100s of nodes and you need to do

replication
add more nodes
rebalancing data
monitoring entire cluster
Sharding
Backup/restore

You have to write a lot of tooling and lot of monitoring/babysitting to scale the cluster. Plain stock Mysql will scale up to a limit but vertically scaling has its own issues. So +1 for Mysql but not everything should be stuffed there.

But recently me and my team built full text indexing on same dataset using elasticsearch and it seems so far it hasn’t disappointed me. With just 1 engineer and 1 devops guy we are able to build a cluster per datacenter
to store same data. The thing I liked most about elastic search was half way through migration we started facing performance issues and we just added more nodes and the cluster rebalanced itself. Also elastic search has tools like kopf/HQ where I can monitor all nodes in one places. For e.g. this is one of the smallest cluster that we just started migrating and as it grows if we see high load averages then we can add more data or client nodes.

I dont need an army of dbas to manage the cluster as elasticsearch has built in support for

replication
add more nodes
rebalancing data
monitoring entire cluster
Sharding

I had earlier built an event store to store events on top of mysql but I was storing only few months of events into it. Now I have to build a store that can store events for 7 years and I dont want to use Mysql for it as I dont want to baby sit it. Our events data is way way more than the metadata. Because every change to a file generates an event and over 7 years this could be a huge no of records. I dont want to manage an army of mysql servers so researching for some database that has good querying support and can store long lived data with eventual consistency and rock solid durability.

Comments

Consistency

Consistency is important as it makes people feel at home and instills trust in the brand. The iOS ecosystem makes it easy for people to transition from one device type to other because it has a consistent look and feel, actions and this way the brain doesn't have to apply any cognitive effort when moving from iPhone to iPad to a Mac. I learned about Keynotopia recently and was enlightened by its application. I saw the transition of a bootstrapped Startup to a consistent Enterprise Startup and this is why I see the importance of consistency introduction in every department at Startup as early as possible. The UX style needs to be consistent with the entire product line and a company-wide style guide maintained by the UX group is absolutely important from very early days of the Startup. Your button color, fonts, logos need to be consistent in each product and even company presentations. From the moment a customer lands on your Marketing website to the trial perio...

Adventures of a nature lover - 5 national parks in 14 days

To unplug from work and recharge myself I do a 2-3 week trip every year where I am unplugged. Few of the reasons I can totally unplug from work is Unlimited Vacation policy of Egnyte, Excellent support by the Infrastructure team Our ethos of pro-actively fixing issues before they become nuisance. TLDR; It's a long post so you can scroll down and first see see images if you need motivation to read it entirely. Me and my family like national parks and camping to recharge us as there is no cell phone coverage in parks and you are completely unplugged from technology most of the times. We have done many of the national parks nearby and this year we want to see glacier national park as the glaciers may disappear in 10-15 years so see them before they are gone. Behind every successful trip is a "Trip planner" and for our family its my wife, she researched and made a trip itinerary book. She booked camp sites 6 months in advance She researched trail...

Compartmentalization helps with Deep Work

I had been trying to learn Solidity/Ethereum over the weekends for the past few months and the first 3-4 weekends were a drag as no matter what I do I wasnt able to focus and getting no where. The problem was not with motivation as I was trying to do it for many weeks but all I was able to do was read 100s of blog posts about it but not able to code anything. Aparently I realized that on weekdays most of my work is in the "study" room whereas on weekend I was trying to do it in the living room. Now working in study on wekeend was an issue as it felt more like work than fun so last 2 weekends I tried changing the schedule and went 3 hours every Sunday to library with my son and while he was reading books I was coding in solidity. I also had trouble writing code after 7:00 PM as I thought my brain was tired but last week I tried sitting in study around 10:00 -11:00 in night and boy I was able to focus and code. Net Net I realized that: "Having a consistent Routine ...

A startup guy

Search This Blog