Notes from Couchbase Live London (2014)¶
Opening Keynote: Amadeus Case Study¶
Amadeus sits between travel providers and travel agents, linking up travel agents with hotels, flights, etc. The main users of their systems are travel agents (both traditional agencies such as Thomson and online agencies like Expedia).
They also work on tools for airlines and hotels for managing inventory. 800 million passengers will travel with flights arranged by Amadeus in 2015.
They handle up to 24,000 operations per second, with a 0.5 second response time, dealing in petabytes of data.
They were previously using Oracle in a non-relational manner (serialised blobs of data). They prototyped 2 projects in Couchbase in 2013.
Determining Flight Availability¶
Given a particular flight, date and point of sale, they need to be able to determine availability (point of sale is relevant because of currency exchange rates - you might want to prioritise different regions depending on current exchange rates).
When communicating with external airline inventories, queries are traditionally cached. This leads to problems:
- Non-competitive prices (fresher data might give cheaper seats);
- Rejecting bookings which could actually be fulfilled.
Customers can shift transactions to other brokers if results from Amadeus aren’t as good as a competitor’s results. Stale caches therefore cost money.
Prior to Couchbase, they had a memcached layer sitting on top of a MySQL farm sitting on top of Oracle. Scaling any of these layers was a major undertaking. The application layer currently has to know about the server topology, which makes scaling it or changing it impractical. memcached outages are very disruptive, since all load then goes to MySQL, which can’t handle it.
With Couchbase, they lose the MySQL and memcached layers, and just have 30 Couchbase servers, with 1TB of RAM each.
During node failover, they read from replicas (for their case, reading dirty data is better than no data) - this was a feature they requested, that came in Couchbase 2.1.0.
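The replica fallback described above might look roughly like the following sketch. It uses in-memory stand-ins rather than a Couchbase client; Node, NodeDownError and get_with_replica_fallback are hypothetical names invented for these notes, and the key and seat counts are made up.

```python
class NodeDownError(Exception):
    """Raised when a (simulated) node cannot be reached."""


class Node:
    """Hypothetical stand-in for one server holding a copy of the data."""

    def __init__(self, data, up=True):
        self.data = dict(data)
        self.up = up

    def get(self, key):
        if not self.up:
            raise NodeDownError(key)
        return self.data[key]


def get_with_replica_fallback(key, active, replicas):
    """Try the active copy first, then each replica in turn.

    Returns (value, possibly_stale). A replica may lag behind the active
    copy, so anything read from one is flagged as possibly stale.
    """
    candidates = [(active, False)] + [(replica, True) for replica in replicas]
    for node, possibly_stale in candidates:
        try:
            return node.get(key), possibly_stale
        except NodeDownError:
            continue
    raise NodeDownError(f"no copy of {key!r} is reachable")


# The active node is down, so the read is served by the replica.
active = Node({"flight:example": {"seats": 4}}, up=False)
replica = Node({"flight:example": {"seats": 5}})
value, possibly_stale = get_with_replica_fallback(
    "flight:example", active, [replica]
)
```

The caller gets an answer even while the active node is unavailable, and the flag lets it decide how much to trust that answer.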
They operate a single data centre, with 6 different isolated zones - they can lose one zone without losing data.
They’re using Couchbase for a distributed session store, sharing users’ session information across the tier of application servers.
Features they’re after¶
- Partial bucket replication;
- A better security model;
- The ability to audit transactions.
Opening Keynote: Viber Case Study¶
Viber originated three years ago as an iPhone app for VoIP. They’ve now got text messaging, video chat, and a version for Android. They’re starting to make money by allowing calls to landlines and mobiles, and by selling sticker packages.
They have over 300 million users, growing at 1 million a day. Billions of messages are sent every month, along with billions of minutes of voice.
Original Architecture¶
Viber originated with a tier of application servers, backed by an in-house, in-memory database.
In 2011, they used MongoDB, including a very early version of MongoDB’s sharding support. They then added a Redis cache (purely in memory) in front of MongoDB. Eventually, they had performance problems with MongoDB, so they became more and more reliant on Redis. Redis had no support for sharding, so they wrote their own frontend which dealt with sharding. MongoDB continued to be a problem for performance, so they moved some data to be solely in Redis (circa 2013).
Things that went well:
- It got them through an abrupt growth path;
- They never lost data in MongoDB;
- Redis has reliably good performance.
Things that went badly:
- MongoDB could only manage tens of thousands of operations per second, but they needed hundreds of thousands. Large datasets cause MongoDB problems.
- MongoDB does not scale well with many application servers (it handles each connection in a separate thread, which ends up being very wasteful of CPU and memory).
- The in-house sharding frontend for Redis wasn’t easily scalable (it could only handle increasing the number of servers by factors of two).
They ended up with:
- 1 MongoDB cluster with 150 servers (master and two slaves)
- 3 Redis clusters with a total of around 150 servers.
Current Architecture¶
Needs:
- Very high performance - up to a million operations per second;
- Handling very large data sets;
- Elasticity without performance disruption;
- Using AWS, so node failures are common - they need a system that can handle node failures without service disruption;
- Backup handling;
- High availability;
- Good monitoring capabilities;
- They’d prefer a single database - providing both a caching layer and a persistence layer.
They’ve opted for small nodes (about 60GB RAM per node). They have separate nodes for different types of operations (reads vs sets vs appends). They have at most 60 nodes per cluster. They use XDCR to replicate to a backup cluster. All views are run on their backup clusters, not on production clusters. Backups are then written to disk, and sent to S3.
They currently have over 400 application servers, along with:
- 7 Couchbase clusters, each with up to 60 nodes;
- 0-2 replicas, XDCR and external backups;
- They have a total of 150 Couchbase servers (less than half the number needed with MongoDB and Redis).
Their busy read clusters are handling hundreds of thousands of reads per second, on 10 node clusters.
Migration to Couchbase¶
Migration was done live, with no downtime, and with no data loss, and with data remaining consistent.
5 clusters have been migrated, 1 is in progress, and 1 remains to be done. MongoDB is now retired, so there are just 2 Redis clusters remaining.
They’re interested in Couchbase 2.5 for rack awareness of replicas (since they’re using AWS, they need to make sure their replicas aren’t in the same availability zones as the active copies).
They’re looking to use Elasticsearch, fed via XDCR, for full text search.
Anatomy of a Couchbase App¶
J Chris Anderson demonstrated a web app somewhat similar to Snapchat, which:
- uses Couchbase for asset storage and keyspace management;
- doesn’t require complex queries - build your Couchbase key using predictable key patterns;
- uses Couchbase to provide ordering, immutable keys and expiry;
- is built with Express, React.js and PubNub.
How it works¶
- The app asks the server for a message ID;
- The server hands them out sequentially and atomically;
- The app saves images and audio to URLs based on the message ID;
- Optionally allow messages to self-destruct.
The latency sensitive path just uses key-value operations (getting documents that are referred to by other documents can be done in a reasonable timeframe using two round-trips). The messages in a room are just the messages from 0 to the current message number, so the app just does a bunch of sequential gets.
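As a rough illustration of the flow above (not the code from the demo), the following sketch uses an in-memory stand-in for the bucket. MessageStore, message_key and the room::msg::N key pattern are hypothetical names chosen for these notes; self-destructing messages are not modelled.

```python
import threading


class MessageStore:
    """In-memory stand-in for a key-value bucket (not a Couchbase client)."""

    def __init__(self):
        self._docs = {}
        self._next_id = {}
        self._lock = threading.Lock()

    def next_id(self, room):
        """Atomically hand out the next sequential message ID, starting at 0."""
        with self._lock:
            message_id = self._next_id.get(room, 0)
            self._next_id[room] = message_id + 1
            return message_id

    def issued(self, room):
        """Number of message IDs handed out so far for this room."""
        with self._lock:
            return self._next_id.get(room, 0)

    def put(self, key, doc):
        self._docs[key] = doc

    def get(self, key):
        return self._docs.get(key)  # None if the key was never written


def message_key(room, message_id):
    """Predictable key pattern: room name plus message number."""
    return f"{room}::msg::{message_id}"


def post_message(store, room, body):
    message_id = store.next_id(room)
    store.put(message_key(room, message_id), {"id": message_id, "body": body})
    return message_id


def read_room(store, room):
    """Fetch messages 0..current with sequential gets; no query is needed."""
    docs = (store.get(message_key(room, i)) for i in range(store.issued(room)))
    return [doc for doc in docs if doc is not None]


store = MessageStore()
post_message(store, "lobby", "hello")
post_message(store, "lobby", "world")
bodies = [message["body"] for message in read_room(store, "lobby")]
```

Because every key can be computed from the room name and a number, the client never has to ask the store which messages exist.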
Scaling¶
Since there’s not a lot of logic on the application server (it’s effectively stateless), we can scale by adding machines and balancing load between machines.
The Art & Science of Document Modelling¶
Basics of Data in Couchbase¶
There are three types of keys in Couchbase:
- Deterministic - human readable;
- Random - e.g. UUID;
- Pseudo-random - e.g. user-<UUID> - human-readable random keys, but with the ability to figure out what sort of document it’s going to be based on the key.
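A minimal sketch of the three key styles, using only the Python standard library. The function names are hypothetical, and lower-casing the email address is an assumption made here so that the deterministic key is stable.

```python
import uuid


def deterministic_key(email):
    """Deterministic: derived from a value the application already knows."""
    return email.strip().lower()  # normalisation is an assumption of this sketch


def random_key():
    """Random: a UUID with no embedded meaning."""
    return str(uuid.uuid4())


def pseudo_random_key(doc_type):
    """Pseudo-random: a type prefix followed by a UUID, e.g. user-<UUID>."""
    return f"{doc_type}-{uuid.uuid4()}"


def doc_type_of(key):
    """Recover the document type from the prefix of a pseudo-random key."""
    return key.split("-", 1)[0]


user_key = pseudo_random_key("user")
```

The prefix costs a few bytes per key, but it lets anything that sees only the key work out what kind of document it refers to.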
There are three types of data in Couchbase:
- Primitive data types (e.g. 98.6 or true);
- JSON;
- Binary.
Each key-value pair has two components - metadata and the document. The metadata contains the key, the revision, some flags, expiration information, and type information. The metadata is 56 bytes, plus however many bytes the key is. Metadata is kept in RAM, to allow very quick responses in error conditions (e.g. calling add on a key that already exists).
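To get a feel for what the 56 bytes plus key length means for RAM sizing, here is a back-of-the-envelope calculation. The workload (100 million keys of 36 bytes each) is made up for illustration, and the estimate ignores replicas and any other overhead.

```python
METADATA_OVERHEAD_BYTES = 56  # per-item figure quoted in the session


def metadata_ram_bytes(key_lengths):
    """Total metadata size for keys of the given lengths (in bytes)."""
    return sum(METADATA_OVERHEAD_BYTES + length for length in key_lengths)


# Hypothetical workload: 100 million keys, each a 36-byte UUID string.
per_item = METADATA_OVERHEAD_BYTES + 36
total_bytes = 100_000_000 * per_item
total_gib = total_bytes / 2**30
```

Under these assumptions each item needs 92 bytes of metadata, or roughly 8.6 GiB across the data set, before a single document value is held in memory. Shorter keys reduce that figure directly.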
Keys are partitioned into vBuckets by taking a CRC32 hash of the key, and picking an appropriate vBucket. vBuckets are distributed across the servers in a cluster, and rebalancing is the process of moving vBuckets between servers. CRC32 evenly distributes keys across vBuckets, and vBuckets are distributed evenly across servers.
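The two-step mapping (key to vBucket, vBucket to server) can be sketched as follows. This is a simplification: the real client libraries do some extra bit manipulation on the hash, and the count of 1024 vBuckets, the round-robin placement and the function names are assumptions made for this sketch.

```python
import zlib

NUM_VBUCKETS = 1024  # assumed vBucket count for this sketch


def vbucket_for(key, num_vbuckets=NUM_VBUCKETS):
    """Map a key to a vBucket number using a CRC32 hash (simplified)."""
    return zlib.crc32(key.encode("utf-8")) % num_vbuckets


def build_vbucket_map(servers, num_vbuckets=NUM_VBUCKETS):
    """Assign vBuckets to servers round-robin, giving each an even share."""
    return [servers[vb % len(servers)] for vb in range(num_vbuckets)]


def server_for(key, vbucket_map):
    """Find the server that currently owns the key's vBucket."""
    return vbucket_map[vbucket_for(key, len(vbucket_map))]


servers = ["node-a", "node-b", "node-c", "node-d"]
vbucket_map = build_vbucket_map(servers)
owner = server_for("dominic@example.com", vbucket_map)

# Rebalancing only changes which server owns a vBucket; the key's vBucket
# number stays the same because it depends on the key alone.
rebalanced_map = build_vbucket_map(servers + ["node-e"])
```

Keeping the hash independent of the server list is what allows vBuckets to be moved without rehashing any keys.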
Key Selection¶
Deterministic keys make fetching data very straightforward. For example - if users log in using email address, the email address makes a good key. However - what if users can also log in with a username? We can work around that by adding a lookup-key - if the username is dominic, and the email is dominic@example.com, we add two documents at sign up:
- dominic with the value dominic@example.com;
- dominic@example.com, which contains the account information.
Since read operations are fairly quick, we can just use two round-trips.
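A minimal sketch of the lookup-key pattern, with a plain dictionary standing in for the bucket. sign_up and find_account are hypothetical names, and telling the two document types apart by checking for a string value is a shortcut taken for this sketch.

```python
store = {}  # in-memory stand-in for a bucket


def sign_up(username, email, account):
    """Write the account document and the lookup document at sign-up."""
    if username in store or email in store:
        raise KeyError("username or email already taken")
    store[email] = account   # main document, keyed by email address
    store[username] = email  # lookup document: username -> email address


def find_account(login):
    """Resolve a login (email or username) using at most two gets."""
    value = store.get(login)
    if isinstance(value, str):  # a lookup document: follow it to the account
        value = store.get(value)
    return value


sign_up("dominic", "dominic@example.com", {"name": "Dominic"})
by_username = find_account("dominic")
by_email = find_account("dominic@example.com")
```

Logging in by email costs one get and logging in by username costs two, and neither path needs a view.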
An alternative to this approach is looking up usernames with a view, which gives you the user record directly from the username. This will involve asking each node in the Couchbase cluster (since each node has the subset of the view for the records which are active on that node).
Couchbase Server Unplugged¶
Why do we need Couchbase Server?¶
Tape is dead, disk is tape, flash is disk, RAM locality is King.
—Jim Gray (2006)
As memory gets cheaper and cheaper, keeping data in RAM becomes a more attractive and practical proposition.
Increasingly data is represented as JSON at application boundaries.
What is Couchbase Server?¶
Couchbase Server allows for Layer Consolidation - one layer subsumes another, and both layers benefit and share common services (as demonstrated in the keynotes from Viber and Amadeus). Having fewer layers reduces the logic needed to keep them in sync.
Couchbase is a distributed, master-less, shared-nothing, memory-first, auto-sharded document database that is built to perform at web-scale.
Couchbase is a document-database paired with a cache (i.e. we keep stuff in memory).
N1QL¶
N1QL (/ˈnɪkəl/) is an expressive, SQL-like query language. The “N1” comes from non-first normal form, meaning we relax the requirements of first normal form, allowing for duplication of data. See my notes on the N1QL session.
Common use cases¶
- Social gaming
- Ad targeting
- User profile store
- Session store
- Content and metadata store
- High availability cache
Node.js & Couchbase - Full Stack JSON¶
To make a game API, you need to make things fast, and you need to be able to scale. Most players won’t pay you anything, so you need to be resource-efficient too, so that you can still make money.
Node.js has an event-driven programming model. Most Node.js apps have little state, and can be scaled by adding application servers (with appropriate load balancing). You then need a data store that can be talked to by multiple application servers.
Couchbase recognises JSON as a datatype, allowing us to do more complex queries on JSON values using views and map/reduce.
Note
Much of this session was demonstrating code, so there aren’t a lot of notes.
A N1QL for every Query¶
The real world isn’t all structured and ordered.
In the traditional (relational) data model, you need to fit the data you have into the model you’ve decided on: data must all fit into the same shape, and changing that shape can be costly. In the rich (document-oriented) data model, your data is varied and your schema may vary [1]: data can fit into a multitude of shapes, and those shapes can change easily.
SQL works well for the traditional data model, but not so well for the rich data model, where schemas are shifting.
What is N1QL?¶
N1QL embraces the JSON document model, with SQL-like syntax to ease transition. It works with structured, semi-structured, and unstructured data. JSON is fully supported, and more formats may be supported in future.
N1QL supports aggregations, filtering and transformations. You can, for example, transform one array into another array. In SQL when you run an operation, you get a result set consisting of a series of rows. In N1QL, you operate on JSON documents, and the result of your operation is another JSON document.
The “N1” comes from non-first normal form - we have multivalued attributes and nested objects.
Whilst views are fairly heavyweight, N1QL allows you to have high-performance declarative indexes which are more lightweight.
N1QL Basics¶
A query in N1QL has three parts to it:
- SELECT - what parts of each document do you want?
- FROM - what data bucket or data store do you want it from?
- WHERE - what conditions must be met for a document to be returned?
The output is in the form of a JSON document.
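The way the three parts fit together can be mimicked in a few lines of Python. This is only an emulation over a list of dictionaries, not a N1QL client: the contacts data, the run_query helper and the "results" envelope are all assumptions made for this sketch.

```python
import json

# Hypothetical bucket contents. The N1QL statement being mimicked would be
# something like:  SELECT name FROM contacts WHERE age > 30
contacts = [
    {"name": "dave", "age": 46, "hobbies": ["golf", "surfing"]},
    {"name": "earl", "age": 46},
    {"name": "fred", "age": 18},
]


def run_query(select, from_, where):
    """Apply WHERE to each document, then keep only the SELECTed attributes."""
    rows = [
        {field: doc[field] for field in select if field in doc}
        for doc in from_
        if where(doc)
    ]
    return json.dumps({"results": rows})


output = run_query(
    select=["name"],
    from_=contacts,
    where=lambda doc: doc.get("age", 0) > 30,
)
```

The point of the sketch is that both the input and the output are JSON: documents of different shapes go in, and the result is itself a JSON document rather than a fixed-width row set.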
N1QL supports string operations such as concatenation and string matching (including support for wildcards).
N1QL also supports GROUP BY and ORDER BY, as well as pagination constructs such as LIMIT and OFFSET. Other functions, such as AVG, ROUND, TRUNC, SUM, MIN and MAX, are also available [2].
N1QL handles missing values differently from NULL values (missing means that attribute doesn’t exist in the document, NULL that it does exist, but is NULL). You can select between these two conditions using IS MISSING and IS NULL.
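The same distinction exists in any JSON library, which makes it easy to illustrate outside N1QL. In this Python sketch the helper names and sample documents are made up; an absent attribute plays the role of MISSING, and a JSON null (None in Python) plays the role of NULL.

```python
import json


def is_missing(doc, attr):
    """True when the attribute does not exist in the document at all."""
    return attr not in doc


def is_null(doc, attr):
    """True when the attribute exists and its value is JSON null."""
    return attr in doc and doc[attr] is None


with_null = json.loads('{"name": "dave", "email": null}')
without_attr = json.loads('{"name": "earl"}')
```

Here is_null(with_null, "email") holds while is_missing(with_null, "email") does not, and the reverse is true for without_attr. A plain doc.get("email") would return None in both cases and lose the difference.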
N1QL also allows you to join data between multiple buckets.
[1] Ilam said he doesn’t like the term schemaless, since data in NoSQL databases still has a schema; it’s just that the schema is flexible, and may vary between values.
[2] Ilam mentioned that a commonly asked question was whether user-defined functions are supported. This feature is planned, but not currently available.