Summary of Alma-OSF’s
Evaluation of MongoDB
for Monitoring Data
Heiko Sommer
June 13, 2013
Heavily based on the presentation by
Tzu-Chiang Shen, Leonel Peña
ALMA Integrated Computing Team Coordination & Planning Meeting #1
Monitoring Storage Requirement
Expected data rate with 66 antennas:
- 150,000 monitor points ("MP"s) total
- MPs get archived once per minute
  • ~1 minute of MP data is bucketed into a "clob"
- ~7,000 clobs/s; ~25-30 GB/day, ~10 TB/year
  • 2,500 clobs/s + dependent-MP demultiplexing + fluctuations
- Equivalent to ~310 KByte/s, i.e. about 2.485 Mbit/s (see the quick check below)
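A quick back-of-the-envelope check of these figures; the inputs are the numbers quoted above, everything else is simple arithmetic:

// Rough consistency check of the quoted rates (mongo shell / JavaScript).
var monitorPoints = 150000;                     // total MPs with 66 antennas
var baseClobsPerSec = monitorPoints / 60;       // archived once per minute -> 2,500 clobs/s
var bytesPerDay = 27.5e9;                       // midpoint of the quoted 25-30 GB/day
var kBytesPerSec = bytesPerDay / 86400 / 1000;  // ~318 KByte/s, close to the quoted ~310 KByte/s
var mbitPerSec = kBytesPerSec * 8 / 1000;       // ~2.5 Mbit/s
var tBytesPerYear = bytesPerDay * 365 / 1e12;   // ~10 TB/year, as quoted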
Monitoring data characteristics
- Simple data structure: [ID, timestamp, value]
- But a huge amount of data
Prior DB Investigations
Oracle: See Alisdair's slides.
MySQL
- Query problems, similar to the Oracle DB
HBase (2011-08)
- Got stuck with Java client problems
- Poor support from the community
Cassandra (2011-10)
- Keyspace / replicator issue resolved
- Poor insert performance: only 270 inserts/minute (unclear what size)
- Clients froze
These experiments were done "only" with some help from archive operators.
MongoDB is no-SQL and document oriented.
The storage format is BSON, a variation of JSON.
Documents within a collection can differ in structure.
- For monitor data we don't really need this freedom.
Other features: Sharding, Replication, Aggregation (Map/Reduce)
Very Brief Introduction of
MongoDB
SQL        mongoDB
Database   Database
Table      Collection
Row        Document
Field      Field
Index      Index
Very Brief Introduction of MongoDB …
A document in mongoDB:
{
_id: ObjectID("509a8fb2f3f4948bd2f983a0"),
user_id: "abc123",
age: 55,
status: 'A'
}
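For illustration, a minimal sketch of how such a document could be stored and retrieved from the mongo shell; the collection name users is an assumption, not from the slides:

// Hypothetical "users" collection; the fields match the example document above.
db.users.insert({ user_id: "abc123", age: 55, status: "A" });  // _id is generated automatically
db.users.findOne({ user_id: "abc123" });                       // returns the document shown above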
Schema Alternatives
1.) One MP value per doc
One MP value per doc vs. a clob (~1 minute of flattened MP data): see the sketch below.
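The slide's example documents are not reproduced here; below is a hedged sketch of what the two layouts might look like, based on the [ID, timestamp, value] structure mentioned earlier. All field names and the clob layout are illustrative assumptions:

// Variant 1: one monitor-point sample per document (illustrative field names).
{
  monitorPointId: 12345,
  timestamp: ISODate("2012-09-15T15:29:18Z"),
  value: 0
}

// A "clob": ~1 minute of flattened samples for one monitor point in a single document.
{
  monitorPointId: 12345,
  startTime: ISODate("2012-09-15T15:29:00Z"),
  clob: "0|0|0|...|0"   // 60 one-second samples flattened into one string
}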
Schema Alternatives
2.) MP clob per doc
Schema Alternatives
3.) One MP document per day
- One monitor point data structure per day
- Monthly database
- Shard key = antenna + MP, keeps matching docs on the same node
- Updates of pre-allocated documents (a sketch of this layout follows below)
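A hedged sketch of this layout. The metadata.* and hourly.HH.MM.SS field names are taken from the queries shown later; the collection/database names, the shard-key command, and the update statement are illustrative assumptions:

// One document per monitor point per day, pre-allocated with one slot per second
// (only a few slots shown).
{
  metadata: { date: "2012-9-15", antenna: "DV10",
              component: "FrontEnd/Cryostat", monitorPoint: "GATE_VALVE_STATE" },
  hourly: { "15": { "29": { "18": 0, "19": 0 /* ... */ } /* ... */ } /* ... */ }
}

// Possible shard key (assumption): antenna + MP, so that all documents of one
// monitor point stay on the same shard.
sh.shardCollection("monitoring.monitorData_9",
                   { "metadata.antenna": 1, "metadata.monitorPoint": 1 });

// Writing a new sample then becomes an in-place update of a pre-allocated slot:
db.monitorData_9.update(
  { "metadata.date": "2012-9-15", "metadata.antenna": "DV10",
    "metadata.component": "FrontEnd/Cryostat",
    "metadata.monitorPoint": "GATE_VALVE_STATE" },
  { $set: { "hourly.15.29.18": 1 } } );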
Schema Alternatives
Advantages of variant 3.):
- Fewer documents within a collection
  • There will be ~150,000 documents per day
  • The amount of index data will be lower as well
- No data fragmentation problem
- Once a specific document is identified (an O(log n) index lookup), access to a specific range or a single value can be done in O(1)
- Smaller ratio of metadata to data
What would a query look like?
Query to retrieve a value with seconds-level granularity:
- E.g., to get the value of FrontEnd/Cryostat/GATE_VALVE_STATE at 2012-09-15T15:29:18:

db.monitorData_[MONTH].findOne(
  { "metadata.date": "2012-9-15",
    "metadata.monitorPoint": "GATE_VALVE_STATE",
    "metadata.antenna": "DV10",
    "metadata.component": "FrontEnd/Cryostat" },
  { 'hourly.15.29.18': 1 } );
What would a query look like? …
Query to retrieve a range of values:
- E.g., to get the values of FrontEnd/Cryostat/GATE_VALVE_STATE during minute 29 (at 2012-09-15T15:29):

db.monitorData_[MONTH].findOne(
  { "metadata.date": "2012-9-15",
    "metadata.monitorPoint": "GATE_VALVE_STATE",
    "metadata.antenna": "DV10",
    "metadata.component": "FrontEnd/Cryostat" },
  { 'hourly.15.29': 1 } );
Indexes
A typical query is restricted by:
- Antenna name
- Component name
- Monitor point
- Date

db.monitorData_[MONTH].ensureIndex(
  { "metadata.antenna": 1,
    "metadata.component": 1,
    "metadata.monitorPoint": 1,
    "metadata.date": 1 } );
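To confirm that such queries are actually served by this compound index, the query plan can be inspected; a minimal sketch (on MongoDB 2.2 the output should show a BtreeCursor on this index):

// Explain the point query from the previous slide.
db.monitorData_[MONTH].find(
  { "metadata.antenna": "DV10",
    "metadata.component": "FrontEnd/Cryostat",
    "metadata.monitorPoint": "GATE_VALVE_STATE",
    "metadata.date": "2012-9-15" } ).explain();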
A cluster of two nodes was created:
- CPU: Intel Xeon quad-core X5410
- RAM: 16 GByte
- SWAP: 16 GByte
OS:
- RHEL 6.0
- Kernel 2.6.32-279.14.1.el6.x86_64
MongoDB:
- V2.2.1
Real data from Sep-Nov 2012 was used initially, but a tool to generate random data was then implemented (a quick consistency check of the figures follows the list):
- Month: 1 (February)
- Number of days: 11
- Number of antennas: 70
- Number of components per antenna: 41
- Monitoring points per component: 35
- Total daily documents: 100,450
- Total documents: 1,104,950
- Average document size: 1.3 MB
- Collection size: 1,375.23 GB
- Total index size: 193 MB
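Cross-checking the generator figures (simple arithmetic on the numbers above):

// 70 antennas x 41 components x 35 monitor points per component
var dailyDocs = 70 * 41 * 35;        // = 100,450 per-day documents, as quoted
var totalDocs = dailyDocs * 11;      // = 1,104,950 documents for 11 days, as quoted
var collectionMB = totalDocs * 1.3;  // ~1,436,000 MB, i.e. ~1.4 TB; consistent with the
                                     // quoted 1,375 GB, since 1.3 MB/doc is a rounded average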
Schema 1: One Sample of
Monitoring Data per Document
For more tests, see
https://adcwiki.alma.cl/bin/view/Software/HighVolumeDataTestingUsingMongoDB
- Test performance of aggregations / combined queries
- Use Map/Reduce to create statistics (max, min, avg, etc.) over ranges of data, to improve the performance of queries like:
  • e.g., searching for monitoring points whose values are >= 10 (see the sketch after this list)
- Test performance with a year's worth of data
- Stress tests with a large number of concurrent queries
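A hedged sketch of what such a statistics aggregation might look like with the aggregation framework available in MongoDB 2.2. It assumes the per-sample layout of schema variant 1, i.e. one document per sample with a scalar value field; the collection names are assumptions:

// Per-day max/min/avg for one monitor point, assuming one sample per document
// with the metadata.* fields used in the earlier queries plus a scalar "value".
db.monitorData_samples.aggregate([
  { $match: { "metadata.antenna": "DV10",
              "metadata.component": "FrontEnd/Cryostat",
              "metadata.monitorPoint": "GATE_VALVE_STATE" } },
  { $group: { _id: "$metadata.date",
              max: { $max: "$value" },
              min: { $min: "$value" },
              avg: { $avg: "$value" } } }
]);

// Pre-computed statistics stored in a small collection would then let
// "values >= 10" searches avoid scanning the raw samples, e.g.:
// db.monitorStats.find({ max: { $gte: 10 } });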
MongoDB is suitable as an alternative for permanent storage of monitoring data.
The tests reported an ingestion rate of 25,000 clobs/s.
The schema and indexes are fundamental to achieving millisecond-level response times.
What are the requirements going to be like?
- Only extraction by time interval and offline processing, or also "data mining" running on the DB?
- All queries ad hoc and responsive, or also batch jobs?
- Repair / flagging of bad data?
- Later reduction of redundancies?
Can we hide the MP-to-document mapping from upserts/queries?
- Currently queries have to patch together results at the 24-hour and monthly breaks (a hedged sketch of such a mapping layer follows below).
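One way to hide the mapping is a small query layer that translates (monitor point, timestamp) into the right monthly collection, per-day document, and hourly.HH.MM.SS field. The helper below is only a sketch under those assumptions (naming, date format, and single-sample scope); a range that crosses midnight or a month boundary would have to merge results from several documents or collections:

// Hypothetical helper (mongo shell JavaScript) hiding the MP-to-document mapping
// for a single sample, based on the per-day/per-month layout sketched earlier.
function findSample(antenna, component, monitorPoint, date) {  // date: a JS Date, UTC assumed
  var coll = db.getCollection("monitorData_" + (date.getUTCMonth() + 1));  // monthly collection
  var day = date.getUTCFullYear() + "-" + (date.getUTCMonth() + 1) + "-" + date.getUTCDate();
  var path = "hourly." + date.getUTCHours() + "." + date.getUTCMinutes() + "." + date.getUTCSeconds();
  var projection = {};
  projection[path] = 1;                                        // project only the requested slot
  return coll.findOne(
    { "metadata.date": day,
      "metadata.antenna": antenna,
      "metadata.component": component,
      "metadata.monitorPoint": monitorPoint },
    projection);
}

// Example use:
// findSample("DV10", "FrontEnd/Cryostat", "GATE_VALVE_STATE", ISODate("2012-09-15T15:29:18Z"));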