ElasticSearch Plugin - Reference Documentation
Authors: Manuarii Stein, Stephane Maldini, Serge P. Nekoval
Version: 0.18.7.1-SNAPSHOT
1 Introduction
The ElasticSearch plugin aims to provide a simple integration between Grails and the open source search engine ElasticSearch, which is based on Lucene and provides distributed capabilities.
For the moment, the plugin focuses on exposing Grails domain classes. Its syntax and behavior are closely modelled on the existing Searchable Plugin.
Note that the plugin is still under development, so you may not be able to use all the features of ElasticSearch yet.
For now, you should only use this plugin for testing purposes, since some functionality needed in a production environment may be missing.
In addition to this document, you may want to read the official ElasticSearch documentation.
1.1 Features
- Maps domain classes to their corresponding index in ElasticSearch
- Provides an ElasticSearch service for cross-domain searching
- Injects domain class methods for specific domain searching, indexing and unindexing
- Automatically mirrors any changes made through Hibernate to the index
- Allows using the Groovy Content Builder DSL for search queries
- Supports term highlighting
1.2 History
History
- February 01, 2012
- maintenance release 0.18.7.1-SNAPSHOT
- October 25, 2011
- March 25, 2011
- March 8, 2011
- December 16, 2010
Authors and Contributors
Manuarii Stein (doc4web consulting),
Stephane Maldini (doc4web consulting),
Serge P. Nekoval
Get the full and updated list of contributors on the GitHub repository.
Licence
Doc4web consulting and the contributors provide this plugin without any guarantees, under the Apache Software Licence 2.
Copyright 2002-2011 the original author or authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Previous work
Graeme Rocher started the first draft which this plugin is based on.
2 Configuration
The plugin provides a default configuration, but you may add your own settings in your Config.groovy script.
Client mode
You can run the plugin in 3 different modes, detailed in the official ElasticSearch documentation.
The mode is defined with the following config key:
elasticSearch.client.mode = '<mode>'
Possible values:
Value | Description
---|---
node | The plugin creates its own node and joins the ElasticSearch cluster as a client node (node.client = true). This setting requires an ElasticSearch instance that is running and reachable on your network (use the discovery feature).
local | The plugin creates its own local (to the JVM) node. Does not require any running ElasticSearch instance. Useful for development or testing.
transport | The plugin creates a transport client that connects to a remote ElasticSearch instance without joining the cluster.
The "transport" mode requires the host address and port. You can define one or more hosts with the following config key:
elasticSearch.client.hosts = [
    [host:'192.168.0.3', port:9300],
    [host:'192.168.0.4', port:9300]
]
If no host is defined, localhost:9300 will be used by the transport client.
Other properties
elasticSearch.client.transport.sniff
Only usable with a transport client.
Allows the client to sniff the rest of the cluster and add the discovered nodes to its list of machines to use.
In this case, the IP addresses used will be the ones that the other nodes were started with (the “publish” address).
elasticSearch.cluster.name
The name of the cluster for the client to join.
elasticSearch.date.formats
List of date formats used by the JSON unmarshaller to parse any date field properly.
Note: future versions of the plugin may change how formats are handled.
elasticSearch.defaultExcludedProperties
List of domain class properties to ignore automatically (they will not be indexed) if their name matches one of the listed names.
This only applies to default-mapped domain classes, with the static searchable property set to "true", and is not considered when using closure mapping.
elasticSearch.disableAutoIndex
A boolean that, when set to true, stops the plugin from automatically reflecting database save/update/delete operations on the indices.
Defaults to false.
elasticSearch.bulkIndexOnStartup
A boolean determining if the application should launch a bulk index operation upon startup. Defaults to true.
elasticSearch.index.compound_format
Whether the compound file format should be used (boolean setting).
Set to false by default (really applicable for file system based index storage).
More details on this setting in the ElasticSearch documentation.
elasticSearch.index.store.type
Determines how the indices are stored.
More details on the possible values in the ElasticSearch documentation.
Possible value | Description
---|---
memory | Stores the index in memory. Useful for testing.
mmapfs | Stores the shard index on the file system (maps to Lucene MMapDirectory) using mmap.
niofs | Stores the shard index on the file system (maps to Lucene NIOFSDirectory) and allows for multiple threads to read from the same file concurrently.
simplefs | Stores using a straightforward implementation of file system storage (maps to Lucene SimpleFsDirectory) using a random access file.
elasticSearch.maxBulkRequest
Maximum number of requests to process at once.
Reduce this value if you run into memory issues when indexing a large amount of data at once.
If this setting is not specified, 500 is used by default.
The location of the data files of each index / shard allocated on the node.
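As an illustration of how the keys described above fit together, here is a minimal Config.groovy sketch for an application connecting to a remote cluster through a transport client. It only uses keys documented in this section; the host address, the cluster name 'my-cluster' and the reduced maxBulkRequest value are placeholder assumptions to adapt to your own environment.

```groovy
// Config.groovy (sketch): all values below are placeholders, not recommendations.
elasticSearch {
    client.mode = 'transport'
    // Remote instance(s) to connect to; only used in "transport" mode.
    client.hosts = [
        [host: '192.168.0.3', port: 9300]
    ]
    // Let the transport client discover the other nodes of the cluster.
    client.transport.sniff = true
    // Hypothetical cluster name; must match the name of the cluster to join.
    cluster.name = 'my-cluster'
    // Never index properties named "password" on default-mapped classes.
    defaultExcludedProperties = ['password']
    // Keep automatic mirroring of database changes enabled.
    disableAutoIndex = false
    // Smaller bulk batches, for example when indexing runs short of memory.
    maxBulkRequest = 200
}
```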
Default configuration script
Below is the default configuration loaded by the plugin (any settings in your Config.groovy script override these).
elasticSearch {
    /**
     * Date formats used by the unmarshaller of the JSON responses
     */
    date.formats = ["yyyy-MM-dd'T'HH:mm:ss'Z'"]

    /**
     * Hosts for remote ElasticSearch instances.
     * Will only be used with the "transport" client mode.
     * If the client mode is set to "transport" and no hosts are defined, ["localhost", 9300] will be used by default.
     */
    client.hosts = [
        [host:'localhost', port:9300]
    ]

    /**
     * Default mapping property exclusions
     *
     * No properties matching the given names will be mapped by default
     * ie, when using "searchable = true"
     *
     * This does not apply for classes using mapping by closure
     */
    defaultExcludedProperties = ["password"]

    /**
     * Determines if the plugin should stop reflecting any database save/update/delete automatically
     * on the ES instance. Defaults to false.
     */
    disableAutoIndex = false

    /**
     * Should the database be indexed at startup.
     *
     * The value may be a boolean true|false.
     * Indexing is always asynchronous (compared to Searchable plugin) and executed after BootStrap.groovy.
     */
    bulkIndexOnStartup = true

    /**
     * Max number of requests to process at once. Reduce this value if you have memory issues when indexing a big amount of data
     * at once. If this setting is not specified, 500 will be used by default.
     */
    maxBulkRequest = 500
}

environments {
    development {
        /**
         * Possible values : "local", "node", "transport"
         * If set to null, "node" mode is used by default.
         */
        elasticSearch.client.mode = 'local'
    }
    test {
        elasticSearch {
            client.mode = 'local'
            index.store.type = 'memory' // store local node in memory and not on disk
        }
    }
    production {
        elasticSearch.client.mode = 'node'
    }
}
3 Mapping
3.1 QuickStart
Default mapping
To declare a domain class as searchable, the simplest way is to define the following static property in the class:
static searchable = true
The plugin will generate a default mapping for each property of the domain.
Custom mapping
You can customize how each property is mapped to the index using a closure. The syntax is similar to GORM's mapping DSL.
static searchable = {
    // mapping DSL…
}
See below for more details on the mapping DSL.
Limit properties with only/except
only and except are used to limit the properties that are made searchable.
You may not define both the except and only settings at the same time.
The following code maps only the 'message' property; all other properties are ignored.
class Tweet {
    static searchable = {
        only = 'message'
    }
    String message
    String someUselessField
}
The following code will map all properties except the one specified.
class Tweet {
    static searchable = {
        except = 'someUselessField'
    }
    String message
    String someUselessField
}
You can use a Collection to specify several properties.
class Tweet {
    static searchable = {
        except = ['someUselessField', 'userName']
    }
    String message
    String userName
    String someUselessField
}
Ignored properties are not sent to ElasticSearch. This also means that when you get a domain instance back from ElasticSearch, some fields that are not supposed to be null may still be null.
3.2 Class mapping
root
Determines whether the domain class will have its own index. Takes a boolean as parameter, and is set to true by default.
class Preference {
    static searchable = {
        root false
    }
    // …
}

class Tag {
    static searchable = true
    // …
}

class Tweet {
    static searchable = {
        message boost:2.0
    }
    // …
}
In this code, the classes Tweet and Tag will have their own index. The class Preference will not.
It also means that a search request will never return a Preference-type hit. The dynamic method search will not be injected in the Preference domain class.
Domains that are not root-mapped can still be considered searchable, as they can be components of another domain which is root-mapped.
For example, consider the following domain:
class User {
    static searchable = {
        userPreferences component:true
    }

    Preference userPreferences
}
When searching, any match in the userPreferences property will be considered a User match.
3.3 Properties mapping
You can customize the mapping of each domain property using the closure mapping.
The syntax is simple:
static searchable = {
    propertyName option1:value, option2:value, …
}
Available options
Option name | Description
---|---
boost | A decimal boost value. With a positive value, promotes search results for hits in this property; with a negative value, demotes search results that hit this property.
component | To be used only on a domain (or collection of domains) property; makes the property a searchable component.
converter | A Class to use as a converter during the marshalling/unmarshalling process for that particular property. That class must extend the java.beans.PropertyEditorSupport Java class.
excludeFromAll | A boolean that determines whether the property is appended to the "_all" field. Defaults to true.
index | How or if the property is made into a searchable element. One of "no", "not_analyzed" or "analyzed".
reference | To be used only on a domain (or collection of domains) property; makes the property a searchable reference.
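To illustrate the options above, here is a minimal sketch of a domain class combining several of them, together with a converter class. The Article class, its properties and the CsvListConverter class are hypothetical names introduced for this example. The sketch also assumes that the plugin drives the converter through the standard getAsText/setAsText methods of PropertyEditorSupport, which this document does not confirm.

```groovy
import java.beans.PropertyEditorSupport

// Hypothetical converter: represents a list of strings as one
// comma-separated string, and parses it back into a list.
class CsvListConverter extends PropertyEditorSupport {
    @Override
    String getAsText() {
        def current = getValue()
        current == null ? '' : (current as List).join(',')
    }

    @Override
    void setAsText(String text) {
        setValue(text ? text.split(',').toList() : [])
    }
}

// Hypothetical domain class using the closure mapping.
class Article {
    static searchable = {
        title boost: 2.0               // hits in the title rank higher
        isbn index: 'not_analyzed'     // indexed as a single exact value
        internalNotes index: 'no'      // kept out of the searchable elements
        keywords converter: CsvListConverter
    }

    String title
    String isbn
    String internalNotes
    List<String> keywords
}

// The converter can be exercised on its own, outside of Grails.
def converter = new CsvListConverter()
converter.setAsText('grails,search')
println converter.getValue()
```

Keeping the converter free of any plugin dependency makes it easy to unit test separately from the mapping.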
3.4 Searchable Component-Reference
The plugin supports searchable-component and searchable-reference behavior similar to that of Compass when you are dealing with domain associations.
See below for the difference between the two mapping modes.
3.4.1 Searchable Reference
The searchable-reference mapping mode is the default mode used for association, and requires the
searchable class of the association to be root-mapped in order to have its own index.
With this mode, the associated domains are not completely marshalled in the resulting JSON document:
only the id and the type of the instances are kept.
When the document is retrieved from the index, the plugin will automatically rebuild the association from the
indices using the stored id.
Example
class MyDomain {
    // odom is an association with the OtherDomain class, set as a reference
    OtherDomain odom

    static searchable = {
        odom reference:true
    }
}

// The OtherDomain definition, with default searchable configuration
class OtherDomain {
    static searchable = true

    String field1 = "val1"
    String field2 = "val2"
    String field3 = "val3"
    String field4 = "val4"
}
When indexing an instance of MyDomain, the following JSON documents will be sent to ElasticSearch:
{
    "mydomain": {
        "_id":1,
        "odom": { "id":1 }
    }
}

{
    "otherdomain": {
        "_id":1,
        "field1":"val1",
        "field2":"val2",
        "field3":"val3",
        "field4":"val4"
    }
}
3.4.2 Searchable Component
The searchable-component mapping mode must be explicitly set, and does not require the
searchable class of the association to be root-mapped.
With this mode, the associated domains are nested in the parent document.
Example
class MyDomain {
    // odom is an association with the OtherDomain class, set as a component
    OtherDomain odom

    static searchable = {
        odom component:true
    }
}

// The OtherDomain definition, with default searchable configuration
class OtherDomain {
    static searchable = true

    String field1 = "val1"
    String field2 = "val2"
    String field3 = "val3"
    String field4 = "val4"
}
When indexing an instance of MyDomain, the following JSON document will be sent to ElasticSearch:
{
    "mydomain": {
        "_id":1,
        "odom": {
            "_id":1,
            "field1":"val1",
            "field2":"val2",
            "field3":"val3",
            "field4":"val4"
        }
    }
}
4 Indexing
With its default configuration (with the disableAutoIndex configuration key set to false), the plugin automatically indexes any searchable domain when GORM/Hibernate performs a save or an update in the database.
It also automatically deletes from the index any document corresponding to a domain that is deleted from the database.
You normally shouldn't have to worry about indexing, but sometimes you may have to do it yourself, for example on a dirty domain object that you may not want to save right now.
The plugin provides a few methods, injected in the domain or available in the ElasticSearchService, to allow that.
Index examples
// Index all searchable instances
elasticSearchService.index()

// Index a specific domain instance
MyDomain md = new MyDomain(value:'that')
md.save()
elasticSearchService.index(md)

// Index a collection of domain instances
def ds = [new MyDomain(value:'that'), new MyOtherDomain(name:'this'), new MyDomain(value:'thatagain')]
ds*.save()
elasticSearchService.index(ds)

// Index all instances of the specified domain class
elasticSearchService.index(MyDomain)
elasticSearchService.index(class:MyDomain)
elasticSearchService.index(MyDomain, MyOtherDomain)
elasticSearchService.index([MyDomain, MyOtherDomain])
Unindex examples
// Unindex all searchable instances
elasticSearchService.unindex()

// Unindex a specific domain instance
MyDomain md = new MyDomain(value:'that')
md.save()
elasticSearchService.unindex(md)

// Unindex a collection of domain instances
def ds = [new MyDomain(value:'that'), new MyOtherDomain(name:'this'), new MyDomain(value:'thatagain')]
ds*.save()
elasticSearchService.unindex(ds)

// Unindex all instances of the specified domain class
elasticSearchService.unindex(MyDomain)
elasticSearchService.unindex(class:MyDomain)
elasticSearchService.unindex(MyDomain, MyOtherDomain)
elasticSearchService.unindex([MyDomain, MyOtherDomain])
5 Searching
The plugin provides 2 ways to send search requests.
- You can use the elasticSearchService and its public search method for cross-domain searching, meaning that ElasticSearch may analyze multiple indices and return hits of different types (i.e. different domains).
def res = elasticSearchService.search("${params.query}")
// 'res' search results may contain multiple types of results
- You can use the injected dynamic method in the domain for domain-specific searching.
def res = Tweet.search("${params.query}")
// 'res' search results contain only Tweet instances
These search methods return a Map containing 3 entries:
- a total entry, representing the total number of hits found
- a searchResults entry, containing the hits
- a scores entry, containing the hits' scores
Example
def res = Tweet.search("${params.query}")
println "Found ${res.total} result(s)"
res.searchResults.each {
    println it.message
}

def res = elasticSearchService.search("${params.query}")
println "Found ${res.total} result(s)"
res.searchResults.each {
    if(it instanceof Tweet) {
        println it.message
    } else {
        println it.toString()
    }
}
If you only want to retrieve the number of hits for a particular query, you can use the countHits() method. It only returns an Integer representing the total number of hits matching your query.
Example
def res = Tweet.countHits("${params.query}")
println "Found ${res} result(s)"

def res = elasticSearchService.countHits("${params.query}", [indices:'test'])
println "Found ${res} result(s)"
5.1 Query Strings
The search method injected in the domain or available in the ElasticSearchService has multiple signatures.
You can pass it a simple String to compute your search request. That string will be parsed by the Lucene query parser, so feel free to use its syntax for more specific search queries.
You can find out about the syntax on the Apache Lucene website.
Example
def results = elasticSearchService.search("${params.query}")
def resultsTweets = Tweet.search("message:${params.query}")
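Because the query string is handed to the Lucene query parser, characters such as : or ( in raw user input are interpreted as query syntax. The following is a minimal sketch, not part of the plugin, of a helper that backslash-escapes the Lucene special characters before the string is passed to search. The name escapeLucene is hypothetical.

```groovy
// Hypothetical helper, not provided by the plugin: puts a backslash in front
// of each character that the Lucene query parser treats as query syntax.
String escapeLucene(String input) {
    input.replaceAll(/([+\-!(){}\[\]^"~*?:\\\&\|])/, '\\\\$1')
}

// A user-supplied value that would otherwise be parsed as a field query.
println escapeLucene('title:(1+1)')
```

With such a helper, the second call above could be written as Tweet.search("message:${escapeLucene(params.query)}"), so that only the message: prefix is treated as query syntax.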
5.2 Query Closure
You can use the Groovy Query DSL to build your search query as a Closure.
The format of the search Closure follows the same JSON syntax as the ElasticSearch REST API and the Java Query DSL.
Example
def result = elasticSearchService.search(searchType:'dfs_query_and_fetch') {
    bool {
        must {
            query_string(query: params.query)
        }
        if (params.firstname) {
            must {
                term(firstname: params.firstname)
            }
        }
    }
}
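Since the closure mirrors the JSON syntax of the REST API, it can help to see the kind of request body the example above corresponds to. The sketch below builds such a body with plain Groovy maps and groovy.json.JsonOutput, independently of the plugin. The params map and its values are placeholders, and the exact JSON the plugin sends to ElasticSearch may differ.

```groovy
import groovy.json.JsonOutput

// Placeholder request parameters, standing in for the controller's params map.
def params = [query: 'grails', firstname: 'Stephane']

// Each "must" block of the closure becomes one entry of the "must" array.
def mustClauses = [[query_string: [query: params.query]]]
if (params.firstname) {
    mustClauses << [term: [firstname: params.firstname]]
}

// REST-style search body: the bool query wrapped in a "query" element.
def body = [query: [bool: [must: mustClauses]]]

println JsonOutput.prettyPrint(JsonOutput.toJson(body))
```

Comparing the printed JSON with the closure shows that each nested closure call maps to a JSON object of the same name.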
5.3 Highlighting
The search method supports highlighting: the automatic wrapping of matching terms in the search results with HTML/XML/other tags.
You can activate it by passing a Closure containing the highlight settings in the highlight parameter of the search method.
The format of the Closure defining the highlight settings is the same as in the ElasticSearch REST API.
Example
// Define the pre & post tags that will wrap each term matched in the document.
def highlightSettings = {
    field 'message'
    field 'tags.name'
    preTags '<strong>'
    postTags '</strong>'
}

def results = Tweet.search("${params.query}", [highlight: highlightSettings])
Highlight results
If a search result is found, the search method adds a highlight entry to the result map.
That entry contains a List with every highlighted fragment/field found for each hit.
def results = Tweet.search("${params.query}", [highlight: { field 'message' }])
def highlighted = results.highlight

results?.searchResults?.eachWithIndex { hit, index ->
    // Retrieve the 'message' field fragments for the current hit
    def fragments = highlighted[index].message?.fragments

    // Print the first fragment, if any
    println fragments?.size() ? fragments[0] : ''
}
Highlighted fields
To determine which fields are to be processed by ElasticSearch, use the field setting.
You can call the field setting as many times as you want to add more fields.
Signature
field <fieldName>[, <fragmentSize>[, <numberOfFragment>]]
Examples
def highlightSettings = {
    field 'message'                 // Add the 'message' field in the highlighted fields list
    field 'tags.name'               // Add the 'name' field contained in the 'tags' field of
                                    // the document in the highlighted fields list
    field 'thatAwesomeField', 0, 20 // Add the 'thatAwesomeField' field with
                                    // some values fixed for fragmentSize and numberOfFragment parameters
}

def highlightSettings2 = {
    field '_all' // Add the special '_all' field in the highlighted fields list
}

def results = Tweet.search("${params.query}", [highlight: highlightSettings])
def results2 = Tweet.search("${params.query}", [highlight: highlightSettings2])
Highlighting tags
By default, ElasticSearch uses the emphasis tag "<em>...</em>" to wrap the matching text.
You can customize the tags with the preTags and postTags settings.
def highlightSettings = {
    field 'message'
    preTags '<myAweSomeTag>'
    postTags '</myAweSomeTag>'
}
6 Admin
The plugin implements a few convenience methods for a few admin-oriented actions.
6.1 Refresh
Explicitly refreshes one or more indices, making all operations performed since the last refresh available for search.
It also flushes the current IndexRequestQueue if there are pending index or delete requests from the application side.
The refresh method is synchronous, meaning that it waits for all operations to complete before resuming the execution of your application.
elasticSearchService.index(domain)
// Some code…
// …
elasticSearchService.index(domain2)
// Some code…
// …
elasticSearchService.index(domain3)
// Some code…
// …
elasticSearchAdminService.refresh() // Ensure that the 3 previous index requests have been made searchable by ES
6.2 Delete Index
Deletes an index, all its mappings and its content from the ElasticSearch instance. Be careful when using this command because it cannot be undone.
Note that the mapping generated by the Grails plugin is also deleted.
The method can be limited to one or more specific indices or applied to all indices at once (called with no parameter).
elasticSearchAdminService.deleteIndex()
7 Low level API
If you need to use the ElasticSearch client directly, you can use the elasticSearchHelper bean, which can be injected in any service or controller, to get the current instance.
Simply wrap your code in a withElasticSearch block, and you will get an org.elasticsearch.client.Client implementation to play with.
class MySearchService {
    static transactional = true

    def elasticSearchHelper

    def myMethod(indexName, settings) {
        elasticSearchHelper.withElasticSearch { client ->
            // Do some stuff with the ElasticSearch client
            client.admin()
                  .indices()
                  .prepareCreate(indexName)
                  .setSettings(settings)
                  .execute()
                  .actionGet()
        }
    }
}
Please refer to the ElasticSearch API for more information on the methods and properties available on the client.
8 Example
The domains
class Tweet {
    static searchable = {
        message boost:2.0
    }

    static belongsTo = [
        user:User
    ]

    static hasMany = [
        tags:Tag
    ]

    static constraints = {
        tags nullable:true, cascade:'save, update'
    }

    String message = ''
    Date dateCreated = new Date()
}

class User {
    static searchable = {
        except = 'password'
        lastname boost:20
        firstname boost:15, index:'not_analyzed'
        listOfThings index:'no'
        someThings index:'no'
        tweets component:true
    }

    static constraints = {
        tweets cascade:'all'
    }

    static hasMany = [
        tweets:Tweet
    ]

    static mappedBy = [
        tweets:'user'
    ]

    String lastname
    String firstname
    String password
    String activity = 'Evildoer'
    String someThings = 'something'
    ArrayList<String> listOfThings = ['this', 'that', 'andthis']
}

class Tag {
    static searchable = {
        except=['boostValue']
    }

    String name
    Integer boostValue = 1
}
The controller
An action triggering indexing in ElasticSearchController (testCaseService only deals with GORM instructions):
class ElasticSearchController {
    def elasticSearchService
    def testCaseService

    def postTweet = {
        if(!params.user?.id) {
            flash.notice = "No user selected."
            redirect(action: 'index')
            return
        }
        User u = User.get(params.user.id)
        if (!u) {
            flash.notice = "User not found"
            redirect(action: 'index')
            return
        }
        // Create tweet
        testCaseService.addTweet(params.tweet?.message, u, params.tags)

        flash.notice = "Tweet posted"
        redirect(action: 'index')
    }
}
With this code (assuming that there are already User instances in the database), new Tweets will be indexed automatically, and the corresponding User indexed documents will be updated, since we have set the tweets association as a component.
Searching for Tweets
def searchForUserTweets = {
    def tweets = Tweet.search("${params.message.search}").searchResults
    def tweetsMsg = 'Messages : '
    tweets.each {
        tweetsMsg += "<br />Tweet from ${it.user?.firstname} ${it.user?.lastname} : ${it.message} "
        tweetsMsg += "(tags : ${it.tags?.collect{t -> t.name}})"
    }
    flash.notice = tweetsMsg
    redirect(action: 'index')
}
Searching for anything
def searchAll = {
    def res = elasticSearchService.search("${params.query}").searchResults
    def resMsg = '<strong>Global search result(s):</strong><br />'
    res.each {
        switch(it){
            case Tag:
                resMsg += "<strong>Tag</strong> ${it.name}<br />"
                break
            case Tweet:
                // Inner double quotes must be escaped inside a GString
                resMsg += "<strong>Tweet</strong> \"${it.message}\" from ${it.user.firstname} ${it.user.lastname}<br />"
                break
            case User:
                resMsg += "<strong>User</strong> ${it.firstname} ${it.lastname}<br />"
                break
            default:
                resMsg += "<strong>Other</strong> ${it}<br />"
                break
        }
    }
    flash.notice = resMsg
    redirect(action:'index')
}