Django Haystack and Elasticsearch
Hello! Today's blog post is about Django Haystack and how to integrate it quickly with Elasticsearch.
First, create a Django project and a new app. (At the beginning of 2016 django-haystack didn't work properly with Django 1.9, so I used version 1.8.9.) Then let's add the model:
from django.db import models

GENDER_CHOICES = (
    ('Male', 'Male'),
    ('Female', 'Female')
)


class Person(models.Model):
    first_name = models.CharField(max_length=100)
    last_name = models.CharField(max_length=100)
    gender = models.CharField(max_length=10, choices=GENDER_CHOICES)
    email = models.CharField(max_length=100)
    ip_address = models.CharField(max_length=100)

    def __str__(self):
        return '{first_name} {last_name}'.format(
            first_name=self.first_name, last_name=self.last_name)
Don't forget to add the created app to settings.py and to run manage.py makemigrations and manage.py migrate. Then register the model in the admin site:
from django.contrib import admin
from .models import Person
admin.site.register(Person)
Then create a simple script which will load data from JSON into the database. This JSON data is randomly generated data from this webpage. Call the script load.py and place it in your Django application folder.
# coding=utf-8
import os
import json

from .models import Person

# MOCK_DATA.json is expected three directory levels above this file.
DATA_FILE = os.path.join(
    os.path.dirname(
        os.path.dirname(
            os.path.dirname(__file__))),
    'MOCK_DATA.json'
)


def run(verbose=True):
    with open(DATA_FILE) as data_file:
        data = json.load(data_file)
    for record in data:
        Person.objects.create(
            first_name=record['first_name'],
            last_name=record['last_name'],
            gender=record['gender'],
            email=record['email'],
            ip_address=record['ip_address'])
        if verbose:
            print(record)
This script looks for the file MOCK_DATA.json, then loads its records into the Django application based on the fields in the JSON. You can run it from manage.py shell:
>>> from django_app import load
>>> load.run()
{'ip_address': '86.24.99.139', 'gender': 'Female', 'first_name': 'Christine', 'last_name': 'Cunningham', 'email': 'ccunninghamrq@howstuffworks.com'}
{'ip_address': '250.20.255.181', 'gender': 'Male', 'first_name': 'Scott', 'last_name': 'Hanson', 'email': 'shansonrr@utexas.edu'}
# rest of the records
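Generated data is not always complete, so it can help to check the records before they reach the database. Below is a minimal sketch of such a check in plain Python. The helper name clean_records and the extra "id" key in the sample are my assumptions for illustration, not part of the original script.

```python
import json

# Field names taken from the Person model above.
REQUIRED_FIELDS = ('first_name', 'last_name', 'gender', 'email', 'ip_address')


def clean_records(raw_json):
    """Parse a JSON array and keep only complete records.

    Returns (valid, skipped): a list of dicts limited to REQUIRED_FIELDS
    and the number of records that lacked at least one of those fields.
    """
    valid, skipped = [], 0
    for record in json.loads(raw_json):
        if all(field in record for field in REQUIRED_FIELDS):
            # Drop any extra keys so the dict matches the model fields.
            valid.append({field: record[field] for field in REQUIRED_FIELDS})
        else:
            skipped += 1
    return valid, skipped


if __name__ == '__main__':
    # Hypothetical sample: one complete record and one broken record.
    sample = '''[
      {"id": 1, "first_name": "Scott", "last_name": "Hanson", "gender": "Male",
       "email": "shansonrr@utexas.edu", "ip_address": "250.20.255.181"},
      {"id": 2, "first_name": "Broken"}
    ]'''
    records, skipped = clean_records(sample)
    print(len(records), 'valid,', skipped, 'skipped')
```

Inside load.py each cleaned dict could then be passed on as Person.objects.create(**record).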
Now it's time to install Elasticsearch. On Ubuntu you can do it as follows:
1. First install Java 8
$ sudo apt-get install python-software-properties -y
$ sudo add-apt-repository ppa:webupd8team/java -y
$ sudo apt-get update
$ sudo apt-get install oracle-java8-installer -y
2. Verify that it's properly installed
$ java -version
java version "1.8.0_72"
Java(TM) SE Runtime Environment (build 1.8.0_72-b15)
Java HotSpot(TM) 64-Bit Server VM (build 25.72-b15, mixed mode)
3. Now install Elasticsearch itself
$ wget -qO - https://packages.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
$ echo "deb http://packages.elastic.co/elasticsearch/1.7.5/debian stable main" | sudo tee -a /etc/apt/sources.list.d/elk.list
$ sudo apt-get update && sudo apt-get install elasticsearch -y
$ sudo service elasticsearch start
4. Verify that Elasticsearch is running
$ curl http://localhost:9200
{
"status" : 200,
"name" : "May \"Mayday\" Parker",
"cluster_name" : "elasticsearch",
"version" : {
"number" : "1.7.5",
"build_hash" : "00f95f4ffca6de89d68b7ccaf80d148f1f70e4d4",
"build_timestamp" : "2016-02-02T09:55:30Z",
"build_snapshot" : false,
"lucene_version" : "4.10.4"
},
"tagline" : "You Know, for Search"
}
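If you prefer to run this check from Python (for example in a deploy script), here is a small sketch using only the standard library. The function names parse_info and fetch_info are made up for this example, and fetch_info assumes a node listening on localhost:9200 as above.

```python
import json
from urllib.request import urlopen


def parse_info(body):
    """Return (status, version number) from the JSON served at the root URL."""
    info = json.loads(body)
    # "status" is present in the 1.7.5 response shown above; .get() is used
    # in case a different version leaves it out.
    return info.get('status'), info['version']['number']


def fetch_info(url='http://localhost:9200'):
    """Fetch the root document from a running node and parse it."""
    with urlopen(url, timeout=5) as response:
        return parse_info(response.read().decode('utf-8'))


if __name__ == '__main__':
    # Parsing a trimmed copy of the response above; no server required.
    sample = '{"status": 200, "version": {"number": "1.7.5"}}'
    print(parse_info(sample))
```

Calling fetch_info() raises an exception when nothing is listening on the port, which is a quick way to notice that the service did not start.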
Now it's time to install two more Python packages:
$ pip install django-haystack==2.4.1
$ pip install elasticsearch
After that, add haystack (along with your app) to INSTALLED_APPS:
INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'haystack',
    'persons'
]
and set up the connection in settings.py:
HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
        'URL': 'localhost:9200',
        'INDEX_NAME': 'haystack',
    },
}
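If Elasticsearch will not always live on localhost, one option is to read the connection details from the environment. This is only a sketch: the variable names ELASTICSEARCH_URL and HAYSTACK_INDEX_NAME and the helper haystack_connections are my own choices, not something Haystack defines.

```python
import os


def haystack_connections(environ=os.environ):
    """Build HAYSTACK_CONNECTIONS, letting the environment override defaults.

    ELASTICSEARCH_URL and HAYSTACK_INDEX_NAME are hypothetical variable
    names chosen for this sketch.
    """
    return {
        'default': {
            'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
            'URL': environ.get('ELASTICSEARCH_URL', 'localhost:9200'),
            'INDEX_NAME': environ.get('HAYSTACK_INDEX_NAME', 'haystack'),
        },
    }


# In settings.py the result is assigned to the setting Haystack reads.
HAYSTACK_CONNECTIONS = haystack_connections()
```

With no variables set this produces exactly the same dictionary as the hard-coded version.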
Create a file called search_indexes.py in your Django application folder (django_project/django_app/search_indexes.py):
from haystack import indexes

from .models import Person


class PersonIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)
    first_name = indexes.CharField(model_attr='first_name')
    last_name = indexes.CharField(model_attr='last_name')
    gender = indexes.CharField(model_attr='gender')
    email = indexes.CharField(model_attr='email')
    ip_address = indexes.CharField(model_attr='ip_address')

    def get_model(self):
        return Person
In this file we declare the indexes which will be created in Elasticsearch. The first field, text, is the primary field to search within. It can be named whatever you want, but the convention is to name it text. Each index has exactly one field with the document=True argument. The other argument, use_template=True, tells Haystack to build the document for the index from a template. By default Haystack looks for this template at django_project/templates/search/indexes/django_app/modelname_fieldname.txt, so for this index it is search/indexes/persons/person_text.txt. For this data it looks like this:
{{ object.first_name }}
{{ object.last_name }}
{{ object.gender }}
{{ object.email }}
{{ object.ip_address }}
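To get a feeling for what ends up in the text field, here is a rough emulation of this template for the five Person fields in plain Python. It is only an illustration: Haystack renders the real file with the Django template engine, and the helper render_document is a name I made up.

```python
# Order of the Person fields in the template above.
TEMPLATE_FIELDS = ('first_name', 'last_name', 'gender', 'email', 'ip_address')


def render_document(person):
    """Emulate the template: one field value per line, in template order."""
    return '\n'.join(str(person[field]) for field in TEMPLATE_FIELDS) + '\n'


if __name__ == '__main__':
    # One of the records loaded earlier.
    scott = {
        'first_name': 'Scott',
        'last_name': 'Hanson',
        'gender': 'Male',
        'email': 'shansonrr@utexas.edu',
        'ip_address': '250.20.255.181',
    }
    print(repr(render_document(scott)))
```

The point is that the document is just the field values glued together as text, which is why a search for any of them can match the person.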
Don't forget to add django_project/templates/ to TEMPLATES in settings.py:
TEMPLATES = [
    {
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
        'DIRS': [os.path.join(BASE_DIR, 'templates/')],
        'APP_DIRS': True,
        'OPTIONS': {
            'context_processors': [
                'django.template.context_processors.debug',
                'django.template.context_processors.request',
                'django.contrib.auth.context_processors.auth',
                'django.contrib.messages.context_processors.messages',
            ],
        },
    },
]
After this, add haystack.urls to urls.py:
from django.conf.urls import include, url
from django.contrib import admin

urlpatterns = [
    url(r'^admin/', admin.site.urls),
    url(r'^search/', include('haystack.urls')),
]
Now it's time to create search.html in django_project/templates/search/search.html:
{% extends 'base.html' %}
{% block content %}
<h2>Person search</h2>
<form method="get" action="." class="form" role="form">
  {{ form.non_field_errors }}
  <div class="form-group">
    {{ form.as_p }}
  </div>
  <div class="form-group">
    <input type="submit" class="btn btn-primary" value="Search">
  </div>
  {% if query %}
  <h3>Results</h3>
  <div>
    <table class="table table-striped table-bordered" cellspacing="0" id='result_table'>
      <thead>
        <tr>
          <th>First name</th>
          <th>Last name</th>
          <th>Gender</th>
          <th>Email</th>
          <th>IP address</th>
        </tr>
      </thead>
      <tbody>
        {% for result in page.object_list %}
        <tr>
          <td>{{ result.first_name }}</td>
          <td>{{ result.last_name }}</td>
          <td>{{ result.gender }}</td>
          <td>{{ result.email }}</td>
          <td>{{ result.ip_address }}</td>
        </tr>
        {% empty %}
        <tr><td colspan="5">No results found.</td></tr>
        {% endfor %}
      </tbody>
    </table>
  </div>
  {% endif %}
</form>
{% endblock content %}
{% block extrajs %}
<script>
  $(document).ready(function() {
    $('#result_table').DataTable({
      "searching": false
    });
  });
</script>
{% endblock %}
This is a basic template for searching. I added DataTables for a better appearance.
Before we can search, let's rebuild the indexes:
$ ./manage.py rebuild_index
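Now try searching in Elasticsearch itself to see if the documents are there. Quote the URL: unquoted, the shell treats & as the end of the command and the q parameter never reaches Elasticsearch, so every document matches (which is what the sample output below shows):
$ curl -XGET 'http://localhost:9200/haystack/_search?pretty=true&q=first_name:Scott'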
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 1000,
"max_score" : 1.0,
"hits" : [ {
"_index" : "haystack",
"_type" : "modelresult",
"_id" : "persons.person.1",
"_score" : 1.0,
"_source":{"django_ct": "persons.person", "last_name": "Harrison", "ip_address": "38.84.45.160", "email": "rharrison0@linkedin.com", "first_name": "Russell", "gender": "Male", "text": "\nRussell\nHarrison\nMale\nrharrison0@linkedin.com\n38.84.45.160\n", "id": "persons.person.1", "django_id": "1"}
},
# rest of results here...
}
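The same query can also be prepared from Python, which sidesteps shell quoting altogether. Here is a sketch with the standard library only. The helper names build_search_url and extract_sources are mine, and the response dictionary in the example is a cut-down stand-in for the real output.

```python
from urllib.parse import urlencode


def build_search_url(index, query, host='http://localhost:9200'):
    """Build a URI search URL; urlencode takes care of escaping the query."""
    params = urlencode({'pretty': 'true', 'q': query})
    return '{host}/{index}/_search?{params}'.format(
        host=host, index=index, params=params)


def extract_sources(response):
    """Return the _source dict of every hit in a decoded search response."""
    return [hit['_source'] for hit in response['hits']['hits']]


if __name__ == '__main__':
    print(build_search_url('haystack', 'first_name:Scott'))
    # Cut-down stand-in for a decoded response, for illustration only.
    sample = {'hits': {'total': 1, 'hits': [
        {'_id': 'persons.person.1', '_source': {'first_name': 'Russell'}},
    ]}}
    print(extract_sources(sample))
```

The URL can be fetched with urllib.request.urlopen and decoded with json.loads before it is handed to extract_sources.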
And that's all. We've got a working search! You can find the repo on GitHub. If you found this post valuable, please send me an email. Thanks!