TriJUG.2013-01-21.v1 - the Triangle Java Users Group
MongoDB and Spring Data
Prepared for:
THE TRIANGLE JAVA USERS GROUP
January 21st, 2013
icfi.com |
1
ICF IRONWORKS
Integrated Services
Interactive
Developing creative ideas and engaging audiences through Web, Mobile, and Social Media
• Social Media + Monitoring
• User + Industry Research
• Web Analytics
• Digital Strategy + Planning
• Mobile Strategy + Execution
• Search Marketing
• Information Architecture + Usability
• Creative Design
• Rich Media Development
Portal + Content Management
Building Internet-based systems to share content, knowledge, and data
• Enterprise Content Management
• Custom Application Development
• Systems Integration
• Portal
• Search
• Cloud Services
• E-Commerce
• Application + Platform Management
Business + IT Alignment
Developing practical strategies to help clients improve business performance
• Business Process Improvement
• IT Strategy and Roadmap
• Governance
• Technology Selection
• Business Intelligence
• Program + Portfolio Management
ICF IRONWORKS
Partnerships and Platform Expertise
ICF Ironworks has experience in the
following market-leading platforms:
• Microsoft
• Ektron
• Autonomy Interwoven
• Oracle UCM and WebLogic Portal
• Alfresco
• SiteCore
• Percussion
• IBM WebSphere
We leverage our strategic partnerships
to enhance the services we provide to
our clients and to build on our sales
pipeline
ICF Ironworks is one of 34 Microsoft
National Systems Integrators (NSI)
ICF IRONWORKS
Healthcare
Mfg/Retail/Distribution
Non-Profit/Assn
Financial
Government
Energy
Who Am I?
Solutions Architect with ICF Ironworks
Part-time Adjunct Professor
Started with HTML and Lotus Notes in 1992
• In the interim there was C, C++, VB, LotusScript, Perl, LabVIEW, etc.
Not so much an Early Adopter as a Fast Follower of Java Technologies
Alphabet Soup (MCSE, ICAAD, ICASA, SCJP, SCJD, PMP, CSM)
LinkedIn: http://www.linkedin.com/in/iamjimmyray
Blog: http://jimmyraywv.blogspot.com/ Avoiding Tech-sand
Tonight’s Agenda
Quick introduction to NoSQL and MongoDB
• Configuration
• MongoVUE
Introduction to Spring Data and MongoDB support
• Spring Data and MongoDB configuration
• Templates
• Repositories
• Query Method Conventions
• Custom Finders
• Customizing Repositories
• Metadata Mapping (including nested docs and DBRef)
• Aggregation Functions
• GridFS File Storage
• Indexes
What is NoSQL?
Official: Not Only SQL
• In reality, it may or may not use SQL*, at least in its truest form
• Varies from the traditional RDBMS approach of the last few decades
• Not necessarily a replacement for an RDBMS; more of a solution for
specific needs where an RDBMS is not a great fit
• Content Management (including CDNs), document storage, object storage,
graph, etc.
It means different things to different folks.
• It really comes down to a different way to view our data domains for
more effective storage, retrieval, and analysis
From NoSQL-Database.org
“NoSQL DEFINITION: Next Generation Databases mostly
addressing some of the points: being non-relational, distributed,
open-source and horizontally scalable. The original intention has
been modern web-scale databases. The movement began early
2009 and is growing rapidly. Often more characteristics apply such
as: schema-free, easy replication support, simple API, eventually
consistent / BASE (not ACID), a huge amount of data and more.”
Some NoSQL Flavors
Document Centric
• MongoDB
• Couchbase
Wide Column/Column Families
• Cassandra
• Hadoop HBase
Key/Value Stores
• Redis
Object
• db4o
Other
• Lotus Notes/Domino
XML
• MarkLogic
Graph
• Neo4j
Why MongoDB
Open Source (written in C++)
Multiple platforms (Linux, Win, Solaris, Apple) and Language Drivers
Explicitly de-normalized
Document-centric and Schema-less
Fast (low latency)
• Fast access to data
• Low CPU overhead
Ease of scalability (replica sets), auto-sharding
Manages complex and polymorphic data
Great for CDN and document-based SOA solutions
Great for location-based and geospatial data solutions
Why MongoDB (more)
Because the schema-less approach is more flexible, MongoDB is
intrinsically ready for iterative (Agile) projects.
Eliminates the “impedance mismatch” of typical RDBMS solutions
• “How do I model my application in 3NF?”
If you are already familiar with JavaScript and JSON, this is an easy
database to understand.
What is schema-less?
A.K.A. schema-free
It means that MongoDB does not enforce a column data type on
the fields within your document, nor does it confine your document
to specific columns defined in a table definition.
The schema is actually controlled via the application API layers
and is implied by the “shape” (content) of your documents.
This means that different documents in the same collection can
have different fields.
• So the schema is flexible in that way
• Only the _id field is mandatory in all documents.
Requires more rigor on the application side.
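A minimal sketch of the schema-less idea, using plain java.util.Map objects in place of real MongoDB documents. The class and method names (SchemaLessSketch, insert) are hypothetical and no MongoDB driver is involved; a random UUID stands in for the ObjectId a real server would generate. It only illustrates that documents in one collection can carry different fields while every stored document ends up with an _id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical in-memory stand-in for a collection; not the MongoDB driver API.
class SchemaLessSketch {
    private final List<Map<String, Object>> documents = new ArrayList<>();

    // Stores a copy of the document, assigning an _id when the caller supplied none.
    Map<String, Object> insert(Map<String, Object> doc) {
        Map<String, Object> stored = new LinkedHashMap<>(doc);
        if (!stored.containsKey("_id")) {
            stored.put("_id", UUID.randomUUID().toString()); // assumption: UUID instead of ObjectId
        }
        documents.add(stored);
        return stored;
    }

    int size() {
        return documents.size();
    }

    public static void main(String[] args) {
        SchemaLessSketch employees = new SchemaLessSketch();

        Map<String, Object> first = new LinkedHashMap<>();
        first.put("lastName", "Smith");
        first.put("firstName", "John");

        // Same collection, different shape: no firstName, but an array field.
        Map<String, Object> second = new LinkedHashMap<>();
        second.put("lastName", "Jones");
        second.put("certifications", Arrays.asList("SCJP", "PMP"));

        System.out.println(employees.insert(first).keySet());
        System.out.println(employees.insert(second).keySet());
    }
}
```

Nothing in the sketch rejects a misspelled or missing field, which is the "more rigor on the application side" point: validation has to live in the application code.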
Why Not MongoDB
High speed and deterministic transactions:
• Banking and accounting
Where SQL is absolutely required
• Where Joins are needed
Traditional non-real-time data warehousing ops
If your organization lacks the controls and rigor to place schema
and document definition at the application level without
compromising data integrity
MongoDB
Was designed to overcome some of the performance
shortcomings of RDBMS
Some Features
• Fast Querying (atomic operations, embedded data)
• In place updates (physical writes lag in-memory changes)
• Full Index support (including compound indexes)
• Replication/High Availability (see CAP Theorem)
• Auto Sharding (range-based partitioning, based on shard key) for
scalability
• Aggregation, MapReduce
• GridFS
MongoDB – In Place Updates
Physical disk writes lag in-memory changes.
• Multiple writes in memory can occur before the object is updated on
disk
MongoDB uses an adaptive allocation algorithm for storing its
objects.
• If an object changes and fits in its current location, it stays there.
• However, if it is now larger, it is moved to a new location. This move
is expensive because indexes must be updated
• MongoDB tracks how often items grow within a collection and
calculates a padding factor that tries to account for object growth
• This minimizes object relocation
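The padding idea above can be sketched as follows. This is a hypothetical simplification, not MongoDB's actual storage code: the class name, the 100% to 200% bounds, and the 10% step are all assumptions chosen for illustration. It shows only the mechanism: a document that still fits is updated in place, while one that outgrows its slot is moved and nudges the padding up for later allocations.

```java
// Hypothetical simplification of padded allocation; not MongoDB's storage engine.
class PaddingSketch {
    // Assumed bounds and step, expressed as percentages to avoid floating point.
    private static final int MIN_PERCENT = 100;
    private static final int MAX_PERCENT = 200;
    private static final int STEP_PERCENT = 10;

    private int paddingPercent = MIN_PERCENT;
    private int moves = 0;

    // Bytes reserved for a document of the given size, rounded up.
    int allocate(int docSize) {
        return (docSize * paddingPercent + 99) / 100;
    }

    // Returns the slot size after an update; the document moves only if it outgrew its slot.
    int update(int allocated, int newSize) {
        if (newSize <= allocated) {
            return allocated; // fits: updated in place, no index changes
        }
        moves++; // relocation: every index entry pointing at the document must change
        paddingPercent = Math.min(MAX_PERCENT, paddingPercent + STEP_PERCENT);
        return allocate(newSize);
    }

    int paddingPercent() {
        return paddingPercent;
    }

    int moves() {
        return moves;
    }

    public static void main(String[] args) {
        PaddingSketch collection = new PaddingSketch();
        int slot = collection.allocate(100);
        slot = collection.update(slot, 90);  // shrinks: stays in place
        slot = collection.update(slot, 120); // grows past the slot: moved
        System.out.println("slot=" + slot + " moves=" + collection.moves()
                + " padding=" + collection.paddingPercent() + "%");
    }
}
```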
MongoDB – A Word About Sharding…
Need to choose the right key
• Easily divisible (“splittable” – see cardinality) so that MongoDB can
distribute data among shards
• “all documents that have the same value in the state field must reside on the
same shard” – 10gen
• Enables distributed write operations between cluster nodes
• Prevents single-shard bottlenecking
• Makes it possible for mongos to return most query operations from a
single mongod instance
• “users will generally have a unique value for this field, MongoDB will be able
to split as many chunks as needed” – 10gen
MongoDB – Cardinality…
You want higher cardinality to allow chunks of data to be split
among shards
• Example: Address data components
• State – Low Cardinality
• ZipCode – Potentially low or high, depending on population
• Phone Number – High Cardinality
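One rough way to compare candidate shard keys is to count the distinct values each field takes in a sample of the data. The sketch below does that with a HashSet; the class name and the sample values are made up for illustration, and a real evaluation would also weigh write distribution and query patterns.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

// Hypothetical helper for comparing candidate shard keys on sample data.
class ShardKeyCardinality {
    // Number of distinct values; more distinct values means more possible chunk split points.
    static int cardinality(List<String> values) {
        return new HashSet<>(values).size();
    }

    public static void main(String[] args) {
        // Made-up sample of four address records.
        List<String> states = Arrays.asList("NC", "NC", "NC", "VA");
        List<String> zipCodes = Arrays.asList("27601", "27601", "27513", "22030");
        List<String> phones = Arrays.asList("919-555-0101", "919-555-0102", "919-555-0103", "703-555-0104");

        System.out.println("state:   " + cardinality(states));
        System.out.println("zipCode: " + cardinality(zipCodes));
        System.out.println("phone:   " + cardinality(phones));
    }
}
```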
CAP Theorem
Consistency
Availability
Partition Tolerance (network partition tolerance)
You can never have all three, so you plan for two and make the
best of the third.
• For example: Perhaps “eventual consistency” is OK for a CDN
application.
• For large scalability, you would need partitioning. That leaves C & A to
choose from
• Would you ever choose consistency over availability?
Container Models: RDBMS vs. MongoDB
RDBMS: Servers > Databases > Schemas > Tables > Rows
• Joins, Group By, ACID
MongoDB: Servers > Databases > Collections > Documents
• No Joins
• Instead: DB References (DBRef, Linking) and Nested Documents (Embedding)
MongoDB Collections
Schema-less
A database can hold up to about 24,000 namespaces, collections and indexes combined (according to 10gen)
• Cheap to resource
Contain documents (…of varying shapes)
• 100 nesting levels (version 2.2)
Are namespaces, like indexes
Can be “Capped”
• Limited in max size with rotating overwrites of oldest entries
• Example: oplog
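The rotating-overwrite behavior of a capped collection resembles a fixed-size queue. The sketch below is a simplification under one stated assumption: it caps by document count, whereas a real capped collection is sized in bytes (with an optional document limit). The class name is hypothetical and no MongoDB API is used.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical in-memory analogue of a capped collection, capped by document count.
class CappedCollectionSketch<T> {
    private final int maxDocuments;
    private final Deque<T> docs = new ArrayDeque<>();

    CappedCollectionSketch(int maxDocuments) {
        if (maxDocuments <= 0) {
            throw new IllegalArgumentException("maxDocuments must be positive");
        }
        this.maxDocuments = maxDocuments;
    }

    // Inserts in arrival order; once full, the oldest entry is dropped to make room.
    void insert(T doc) {
        if (docs.size() == maxDocuments) {
            docs.removeFirst();
        }
        docs.addLast(doc);
    }

    // Snapshot of the contents, oldest first.
    List<T> contents() {
        return new ArrayList<>(docs);
    }

    public static void main(String[] args) {
        CappedCollectionSketch<String> log = new CappedCollectionSketch<>(3);
        for (String op : new String[] {"op1", "op2", "op3", "op4"}) {
            log.insert(op);
        }
        System.out.println(log.contents());
    }
}
```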
MongoDB Documents
JSON (what you see)
• Actually BSON (Internal - Binary JSON - http://bsonspec.org/)
Elements are name/value pairs
16 MB maximum size
What you see is what is stored
• No default fields (columns)
Why BSON?
Adds data types that JSON did not support
Optimized for performance
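Compact binary encoding that is fast to traverse (BSON is not compressed and can be larger than the equivalent JSON)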
MongoDB Install
Extract MongoDB
Build config file, or use startup script
• Need dbpath configured
• Need REST configured for Web Admin tool
Start the mongod (daemon) process
Use Shell (mongo) to access your database
Use MongoVUE for GUI access and to learn shell commands
Mongo Shell
In Windows, mongo.exe
Command-line interface to MongoDB (sort of like SQL*Plus for
Oracle)
MongoVUE
GUI around MongoDB Shell
Makes it easy to learn MongoDB Shell commands
• db.employee.find({ "lastName" : "Smith", "firstName" : "John" }).limit(50);
• show collections
Demo…
Web Admin Interface
http://localhost:28017
Quick stats viewer
Run commands
Demo
There is also Sleepy Mongoose
• http://www.kchodorow.com/blog/2010/02/22/sleepy-mongoose-a-mongodb-rest-interface/
Spring Data
Large Spring project with many subprojects
• Category: Document Stores, Subproject MongoDB
“…aims to provide a familiar and consistent Spring-based
programming model…”
Like other Spring projects, Data is POJO Oriented
For MongoDB, provides high-level API and access to low-level API
for managing MongoDB documents.
Provides annotation-driven meta-mapping
Lets you into the bowels of the API if you choose to hang out there
Spring Data MongoDB Templates
Implements MongoOperations (mongoOps) interface
• mongoOps defines the basic set of MongoDB operations for the Spring
Data API.
• Wraps the lower-level MongoDB API
Provides access to the lower-level API
Provides foundation for upper-level Repository API.
Spring Data MongoDB Templates - Configuration
See mongo-config.xml
Spring Data MongoDB Templates - Configuration
Or…see the config class
Spring Data Repositories
Convenience for data access
• Spring does ALL the work (unless you customize)
Convention over configuration
• Uses a method-naming convention that Spring interprets during
implementation
Hides complexities of Spring Data templates and underlying API
Builds implementation for you based on interface design
• Implementation is built during Spring container load.
Is typed (parameterized via generics) to the model objects you
want to store.
• When extending MongoRepository
• Otherwise uses @RepositoryDefinition annotation
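To make the naming convention concrete: in a Spring Data project, an interface extending MongoRepository could declare a method such as findByLastNameAndFirstName(String lastName, String firstName), and Spring would derive the query from the name. The sketch below is not Spring's parser; it is a JDK-only approximation (hypothetical class QueryMethodSketch) that handles only the findBy prefix and the And keyword, to show how a method name can be turned into the list of properties to match.

```java
import java.util.ArrayList;
import java.util.List;

// JDK-only approximation of query derivation; Spring Data's real parser supports many more keywords.
class QueryMethodSketch {
    private static final String PREFIX = "findBy";

    // Turns "findByLastNameAndFirstName" into [lastName, firstName].
    static List<String> properties(String methodName) {
        if (!methodName.startsWith(PREFIX) || methodName.length() == PREFIX.length()) {
            throw new IllegalArgumentException("expected findBy<Property>[And<Property>...]: " + methodName);
        }
        String criteria = methodName.substring(PREFIX.length());
        List<String> result = new ArrayList<>();
        // Split on "And" only when a capital letter follows, so "AndroidVersion" stays whole.
        for (String part : criteria.split("And(?=[A-Z])")) {
            result.add(Character.toLowerCase(part.charAt(0)) + part.substring(1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(properties("findByLastNameAndFirstName"));
        System.out.println(properties("findByAndroidVersion"));
    }
}
```

Each derived property is then matched against the method parameter in the same position, which is why parameter order has to follow the order of properties in the method name.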
Spring Data Meta Mapping
Annotation-driven mapping of model object fields to Spring Data
elements in specific database dialect.
MongoDB DBRef
Optional
Instead of nesting documents
Have to save the “referenced” document first, so that DBRef exists
before adding it to the “parent” document
MongoDB Custom Spring Data Repositories
Hooks into Spring Data bean type hierarchy that allows you to add
functionality to repositories
Important: You must write the implementation for part of this
custom repository
And…your Spring Data repository interface must extend this
custom interface, along with the appropriate Spring Data repository
Demo
Creating a Custom Repository
Write an interface for the custom methods
Write the implementation for that interface
Write the traditional Spring Data Repository application interface,
extending the appropriate Spring Data interface and the (above)
custom interface
When Spring starts, it will implement the Spring Data Repository
normally, and include the custom implementation as well.
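The three steps above can be sketched in plain Java. Everything here is a stand-in: BaseRepository plays the role of the Spring Data interface (MongoRepository in a real project), the Employee* names are hypothetical, and the wire method uses a JDK dynamic proxy to imitate, very roughly, what the Spring container does at startup when it combines its generated implementation with your hand-written one. In Spring Data the implementation class is conventionally named after the repository interface plus an Impl suffix so it can be located.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Imitates container wiring with a JDK proxy; not how Spring is actually implemented.
class CustomRepositorySketch {
    static EmployeeRepository wire(List<String> lastNames) {
        EmployeeRepositoryCustom custom = new EmployeeRepositoryImpl(lastNames);
        BaseRepository generated = () -> lastNames.size(); // stands in for the generated part
        InvocationHandler handler = (proxy, method, args) -> {
            // Custom methods go to the hand-written class, everything else to the generated part.
            Object target = method.getDeclaringClass() == EmployeeRepositoryCustom.class ? custom : generated;
            return method.invoke(target, args);
        };
        return (EmployeeRepository) Proxy.newProxyInstance(
                EmployeeRepository.class.getClassLoader(),
                new Class<?>[] {EmployeeRepository.class},
                handler);
    }

    public static void main(String[] args) {
        EmployeeRepository repo = wire(Arrays.asList("Smith", "Small", "Jones"));
        System.out.println(repo.count());
        System.out.println(repo.findLastNamesStartingWith("Sm"));
    }
}

// Step 1: interface for the custom methods.
interface EmployeeRepositoryCustom {
    List<String> findLastNamesStartingWith(String prefix);
}

// Step 2: hand-written implementation of that interface.
class EmployeeRepositoryImpl implements EmployeeRepositoryCustom {
    private final List<String> lastNames;

    EmployeeRepositoryImpl(List<String> lastNames) {
        this.lastNames = lastNames;
    }

    @Override
    public List<String> findLastNamesStartingWith(String prefix) {
        List<String> matches = new ArrayList<>();
        for (String name : lastNames) {
            if (name.startsWith(prefix)) {
                matches.add(name);
            }
        }
        return matches;
    }
}

// Stand-in for the Spring Data base interface.
interface BaseRepository {
    long count();
}

// Step 3: the application repository extends both the base and the custom interface.
interface EmployeeRepository extends BaseRepository, EmployeeRepositoryCustom {
}
```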
MongoDB Advanced Queries
http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-%24all
Demo - $in, $nin, $gt, $all
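For readers without the demo, the meaning of the four operators can be approximated with plain Java predicates. This is only a sketch of the semantics for simple cases (hypothetical class QueryOperatorSketch): it treats $in, $nin, and $gt as tests on a single scalar field value and $all as a test on an array field, and it ignores server details such as how missing fields or array-valued fields are matched.

```java
import java.util.Arrays;
import java.util.Collection;

// Approximate semantics of four query operators for simple field values; not driver code.
class QueryOperatorSketch {
    // $in: the field value equals at least one of the candidates.
    static boolean in(Object fieldValue, Collection<?> candidates) {
        return candidates.contains(fieldValue);
    }

    // $nin: the field value equals none of the candidates.
    static boolean nin(Object fieldValue, Collection<?> candidates) {
        return !candidates.contains(fieldValue);
    }

    // $gt: the field value is strictly greater than the bound.
    static <T extends Comparable<T>> boolean gt(T fieldValue, T bound) {
        return fieldValue.compareTo(bound) > 0;
    }

    // $all: the array field contains every required element (order and extras are ignored).
    static boolean all(Collection<?> arrayField, Collection<?> required) {
        return arrayField.containsAll(required);
    }

    public static void main(String[] args) {
        System.out.println(in("NC", Arrays.asList("NC", "VA")));
        System.out.println(nin("WV", Arrays.asList("NC", "VA")));
        System.out.println(gt(42, 30));
        System.out.println(all(Arrays.asList("SCJP", "PMP", "CSM"), Arrays.asList("SCJP", "CSM")));
    }
}
```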
MongoDB Aggregation Functions
Aggregation Framework
Map/Reduce
Distinct - Demo
Group - Demo
• Similar to SQL Group By function
Count
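As an analogy for the Group function, the sketch below counts documents per distinct value of one field using Java streams, much like SELECT field, COUNT(*) ... GROUP BY field in SQL. It runs on in-memory maps rather than against MongoDB, the class name is hypothetical, and as a simplifying assumption it skips documents that lack the field.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// In-memory analogue of a group-and-count; not the MongoDB group command itself.
class GroupSketch {
    // Counts documents per distinct value of the given field, skipping documents without it.
    static Map<String, Long> countBy(List<Map<String, String>> docs, String field) {
        return docs.stream()
                .filter(doc -> doc.get(field) != null)
                .collect(Collectors.groupingBy(doc -> doc.get(field), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Made-up sample documents.
        List<Map<String, String>> employees = new ArrayList<>();
        employees.add(Collections.singletonMap("state", "NC"));
        employees.add(Collections.singletonMap("state", "VA"));
        employees.add(Collections.singletonMap("state", "NC"));
        System.out.println(countBy(employees, "state"));
    }
}
```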
MongoDB GridFS
“…specification for storing large files in MongoDB.”
As the name implies, “Grid” allows the storage of very large files
divided across multiple MongoDB documents.
• Uses native BSON binary formats
16MB per document
• Will be higher in future
Large files added to GridFS get chunked and spread across
multiple documents.
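The chunking step can be sketched with nothing but arrays. The class below is hypothetical and does not use the driver's GridFS classes; it only shows how a file's bytes might be cut into fixed-size pieces, each of which would then be stored as its own document alongside its sequence number. The chunk size is left as a parameter rather than assuming the server default.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of GridFS-style chunking; not the MongoDB driver API.
class GridFsChunkSketch {
    // Splits the file into consecutive chunks; only the last chunk may be shorter.
    static List<byte[]> split(byte[] file, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < file.length; offset += chunkSize) {
            int end = Math.min(file.length, offset + chunkSize);
            chunks.add(Arrays.copyOfRange(file, offset, end));
        }
        return chunks;
    }

    public static void main(String[] args) {
        byte[] file = new byte[10]; // stand-in for file contents
        List<byte[]> chunks = split(file, 4);
        for (int n = 0; n < chunks.size(); n++) {
            System.out.println("chunk " + n + ": " + chunks.get(n).length + " bytes");
        }
    }
}
```

Reading the file back is the reverse: fetch the chunks in sequence order and concatenate them.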
MongoDB Indexes
Similar to RDBMS Indexes
Can have many
Can be compound
• Including indexes of array fields in document
Makes searches, aggregates, and group functions faster
Makes writes slower
Sparse = true
• Only include documents in this index that actually contain a value in the
indexed field.
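A sparse index can be pictured as a sorted map from field value to the documents holding that value, with documents that lack the field simply left out. The sketch below builds such a map in memory. The class name is hypothetical, and as a simplification it treats a null value the same as a missing field, which may not match the server's handling of explicit nulls.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory picture of a sparse index; not how MongoDB stores its B-tree indexes.
class SparseIndexSketch {
    // Maps each indexed value to the positions of the documents that contain the field.
    static TreeMap<String, List<Integer>> build(List<Map<String, String>> docs, String field) {
        TreeMap<String, List<Integer>> index = new TreeMap<>();
        for (int position = 0; position < docs.size(); position++) {
            String value = docs.get(position).get(field);
            if (value == null) {
                continue; // sparse: documents without the field get no index entry
            }
            index.computeIfAbsent(value, key -> new ArrayList<>()).add(position);
        }
        return index;
    }

    public static void main(String[] args) {
        // Made-up documents; the second one has no "email" field.
        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(Collections.singletonMap("email", "a@example.com"));
        docs.add(Collections.singletonMap("lastName", "Smith"));
        docs.add(Collections.singletonMap("email", "b@example.com"));
        System.out.println(build(docs, "email"));
    }
}
```

The trade-off named on the slide shows up here too: every insert has to touch the map, which is why more indexes mean slower writes.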
MongoDB Security
http://www.mongodb.org/display/DOCS/Security+and+Authentication
Default is trusted mode, no security
--auth
--keyFile
• Replica sets require this option
MongoDB Encryption
MongoDB does not support data encryption, per se
Use application-level encryption and store encrypted data in BSON
fields
Or…use TDE (Transparent Data Encryption) from Gazzang
• http://www.gazzang.com/encrypt-mongodb
MongoDB 2.2
Drop-in replacement for 1.8 and 2.0.x
Aggregation without Map Reduce
TTL Collections (alternative to Capped Collections)
Tag-aware Sharding
http://docs.mongodb.org/manual/release-notes/2.2/
Helpful Links
Spring Data MongoDB - Reference Documentation:
http://static.springsource.org/spring-data/data-mongodb/docs/1.0.2.RELEASE/reference/html/
http://nosql-database.org/
www.mongodb.org
http://www.mongodb.org/display/DOCS/Java+Language+Center
http://www.mongodb.org/display/DOCS/Books
http://openmymind.net/2011/3/28/The-Little-MongoDB-Book/
http://jimmyraywv.blogspot.com/2012/05/mongodb-and-spring-data.html
http://jimmyraywv.blogspot.com/2012/04/mongodb-jongo-and-morphia.html
https://www.10gen.com/presentations/webinar/online-conference-deep-dive-mongodb
Questions