• No results found

a year with a talk by Armin Ronacher for PyGrunn 2013

N/A
N/A
Protected

Academic year: 2021

Share "a year with a talk by Armin Ronacher for PyGrunn 2013"

Copied!
93
0
0

Loading.... (view fulltext now)

Full text

(1)

a year with

a talk by Armin '@mitsuhiko' Ronacher for PyGrunn 2013

mongoDB

(2)

That's me.

I do Computers.

Currently at Fireteam / Splash Damage.

(3)

I don't

like it

let's not beat

around the bush

(4)
(5)

“MongoDB is a pretty okay data store”

(6)

this is not a rant

it's our experience in a nutshell

we

nd corner cases

draw your own conclusions

(7)

“MongoDB is like a nuclear reactor: ensure proper

working conditions and it's perfectly safe and powerful.”

(8)
(9)

RAD Soldiers

(10)
(11)

RAD Soldiers API calls

21st 24th

oh  shit,  it's  vertical

MongoDB

Christmas 2012

(12)
(13)
(14)

MongoDB Overview

(15)

WHY?

(16)

Why the fuck

did we pick

(17)

Why the fuck

did we pick

MongoDB?

(18)

Why the fuck

did we pick

MongoDB?

schemaless

scalable

(19)

Why the fuck

did we pick

MongoDB?

schemaless

scalable

simple

(20)

schemaless

scalable

simple

(21)

schemaless

scalable

simple

json records

(22)

schemaless

scalable

simple

json records

auto sharding

think in records

(23)

schemaless is wrong

mongodb's sharding is annoying

thinking in records is hard

(24)

mongod

mongoc

mongos

(25)

mongod

mongoc

mongos

(26)

mongod

mongoc

mongos

mongods

mongocs

(27)

mongod

mongoc

mongos

mongods

mongocs

mongoses

(28)

mongod

mongoc

mongos

mongods

mongocs

mongoses

stores data

(29)

mongod

mongoc

mongos

mongods

mongocs

mongoses

stores data

says what's where

(30)

mongod

mongoc

mongos

mongods

mongocs

mongoses

stores data

says what's where

routes and merges

(31)

Many Moving Parts

mongod

mongoc

(32)

We Fail

(33)

workers on m1.small

most of the time in IO wait

no need for more CPU

(34)
(35)

worker setup

(36)

worker setup

nginx

uwsgi

mongos

mongod

uwsgi

uwsgi

(37)

worker setup

nginx

uwsgi

mongos

mongod

This  becomes  a  problem

uwsgi

uwsgi

(38)

T1

waits for

IO

(39)

worker

: mongos, give me data

mongos

: mongod, give me data

mongos

: worker, here is your data

worker

:

nally! mongos, now give me more data

context  switch

(40)

m1.medium: machines with

2 CPUs*

worker and mongos active at the

same time

what a novel idea

(41)
(42)

CPU Changes

mean  response  time

number  of  requests

(43)

EBS

(44)

Breaking your Instance 101

(45)

MongoDB's Execution Fails

(46)

No transactions

Document-level Operations

(47)
(48)

Expectation

• mongos fans out and proxies

• if mongos loses connection worker is good

(49)

Actual Result

• mongos fans out

• if mongos loses connection it

terminates both sides

• voluntary primary election

kills all connections

(50)

Tail-able Cursors

getLastError()

(51)
(52)

Replica Set Annoyances

1. Add Hidden Secondary

2. Witness it synchronizing

3. Take an existing secondary out

4. Actually unregister the secondary

5. Watch the whole cluster re-elect the same primary

and kill all active connections

(53)

Breaking

your Cluster

101

• add new primary

• remove old primary

• don't shutdown old primary

• network partitions and one of them overrides the

con

g of the other in the mongoc

(54)

MongoDB's Design Fails

(55)
(56)

Schema vs Schema-less is just a different version of

dynamic typing vs. static typing

(57)

static typing with an escape hatch to dynamic typing wins

(58)

we built an ADT based type system anyways

from fireline.schema import types username = types.String()

profile = types.Dynamic()

x = username.convert('mitsuhiko')

(59)
(60)

write oddity

write request

mongodb

GetLastError()

mongodb

(61)

performance fun

import os

from pymongo import Connection

safe = os.environ.get('MONGO_SAFE') == '1'

con = Connection() db = con['wtfmongo'] coll = db['test']

coll.remove()

for x in xrange(50000):

(62)

Disappointing

$ MONGO_SAFE=0 time python test.py

1.92 real 1.37 user 0.27 sys $ MONGO_SAFE=1 time python test.py

(63)

Disappointing

$ MONGO_SAFE=0 time python test.py

1.92 real 1.37 user 0.27 sys $ MONGO_SAFE=1 time python test.py

5.57 real 2.50 user 0.62 sys

(64)

that would not be a problem if safe mode was fast.

(65)

Lack of Joins

(66)

They will happen

1. Before we had joins, we did not have joins

2. not having joins is not a feature

(67)

RethinkDB has Distributed Joins :-)

r \

.table('marvel') \

.inner_join(r.table('dc'),

lambda m, dc: m['strength'] < dc['strength']) \

(68)

MongoDB does

not

have Map-Reduce

(69)

Inconsistent Queries

(70)

Oh got why!?

db.bios.find({

"awards": {"$elemMatch": { "award": "Turing Award", "year": { "$gt": 1980 } }}

})

(71)
(72)

Aggregation Framework comes with SQL Injection

db.zipcodes.aggregate({

"$group": {"_id": "$state",

"total_pop": {"$sum": "$pop"}} }, {

"$match": {"total_pop": {"$gte": 10 * 1000 * 1000}} })

(73)

Aggregation Framework comes with SQL Injection

db.zipcodes.aggregate({

"$group": {"_id": "$state",

"total_pop": {"$sum": "$pop"}} }, {

"$match": {"total_pop": {"$gte": 10 * 1000 * 1000}} })

(74)
(75)

They are important!

1. You will need them or you ha

ve inconsistent data

2. Everybody builds a two-phase commit system

(76)
(77)

MVCC is good for you

(78)

Shitty Index Selection

1. MongoDB picks secondary indexes automatically

2. It will also start using sparse indexes

3. It might not give you results back

4. Sometimes forcing ordering makes MongoDB use a

compound index

(79)

Limited Indexes

1. Given a compound index on [a, b]

2. {a: 1, b: 2} and {$and: [{a: 1}, {b: 2}]} are equivalent

3. Only the former picks up the compound index

4. Negations never use indexes

5. {$or: […]} is implemented as two parallel queries,

both clauses might need separate indexes.

(80)
(81)

Other Things of Note

(82)

Making Mongo not Suck (as much) on OS X

$ mongod --noprealloc --smallfiles --nojournal run

(83)

Windows

1. don't  use  ":"  in  collection  names

2. don't  use  "|"  in  collection  names

3. don't  use  "*"  in  collection  names

4. …⋯

(84)

Keys are huge. In our case

of the Data. Shorten them.

(85)

A MongoDB Cluster needs to boot in a certain Order

(86)
(87)
(88)
(89)

MongoDB is a pretty good data dump thing

it's not a SQL database

(90)

MongoDB is a pretty good data dump thing

it's not a SQL database

(91)

MongoDB is a pretty good data dump thing

it's not a SQL database

but you probably want a SQL database

at least until RethinkDB is ready

(92)

That's it.

Now ask questions.

And add me on twitter:

@mitsuhiko

(93)

Legal Shenanigans

Creative Common Sources for Images:

CPU by EssjazNZ: http://www.flickr.com/photos/essjay/4972875711/

Locks by katiejean97: http://www.flickr.com/photos/katiejean97/7036715845/

Money Money Money by Images_of_Money: http://www.flickr.com/photos/59937401@N07/5474168441/

Through any Window by Josep Ma. Rosell: http://www.flickr.com/photos/batega/1354354592/in/photostream/

RAD Soldiers is a Trademark of WarChest Limited. RAD Soldiers Artwork and Logo Copyright © 2013 by WarChest Limited. All Rights Reserved.

CPU by EssjazNZ: http://www.fl Locks by katiejean97: http://www.fl Money Money Money by Images_of_Money: http://www.fl Through any Window by Josep Ma. Rosell: http://www.fl

References

Related documents

Because we do not limit our service to just those who otherwise could not afford to travel to receive medical treatment and the cost involved in our operation. Gift of Life

This is because space itself is to function as the ‘form’ of the content of an outer intuition (a form of our sensi- bility), as something that ‘orders’ the ‘matter’

Against initial expectations that through the medium of shared faith and ethnicity, Indonesian migrants would share in the religious life of local Malay Muslim society, we thus found

For each text of the corpus &#34;More Like This&#34; the distances to the 221 reference texts are computed and the text is assigned to the group of its nearest neighbor (pseudo

Some projects were solely intended to promote IUDs, others gave an equal emphasis to implants as the main alternative long-acting reversible method (LARC) or to long-acting

– This leads to differentiation of the product which reduce the elasticity of demand and allows firms to withstand the competition in international markets.. • Role of intangibles

The results of the present study showed that ginsenoside Rb1 significantly decreased MCT‑induced EMP levels in EA.hy926 cells, suggesting that ginsenoside Rb1 may prevent

Within this PhD thesis, the Terrestrial Systems Modeling Platform – Parallel Data Assimila- tion Framework (TerrSysMP-PDAF) was extended to allow for assimilating gridded GRACE