Sample Dataset

In the previous steps you used only a few movies, let's now import:

More movies to discover more queries.
Theaters to discover the geospatial capabilities.
Users to do some aggregations.

Dataset Description

Movies

The file sample-app/redisearch-docker/dataset/import_movies.redis is a script that creates 922 Hashes.

The movie hashes contain the following fields.

movie:id : The unique ID of the movie, internal to this database (used as the key of the hash)
title : The title of the movie.
plot : A summary of the movie.
genre : The genre of the movie, for now a movie will only have a single genre.
release_year : The year the movie was released as a numerical value.
rating : A numeric value representing the public's rating for this movie.
votes : Number of votes.
poster : Link to the movie poster.
imdb_id : id of the movie in the IMDB database.

Sample Data: movie:343

Field	Value
title	Spider-Man
plot	When bitten by a genetically modified spider a nerdy shy and awkward high school student gains spider-like abilities that he eventually must use to fight evil as a superhero after tragedy befalls his family.
genre	Action
release_year	2002
rating	7.3
votes	662219
poster	https://m.media-amazon.com/images/M/MV5BZDEyN2NhMjgtMjdhNi00MmNlLWE5YTgtZGE4MzNjMTRlMGEwXkEyXkFqcGdeQXVyNDUyOTg3Njg@._V1_SX300.jpg
imdb_id	tt0145487

Theaters

The file sample-app/redisearch-docker/dataset/import_theaters.redis is a script that creates 117 Hashes (used for Geospatial queries). This dataset is a list of New York Theaters, and not movie theaters, but it is not that critical for this project ;).

The theater hashes contain the following fields.

theater:id : The unique ID of the theater, internal to this database (used as the key of the hash)
name : The name of the theater
address : The street address
city : The city, in this sample dataset all the theaters are in New York
zip : The zip code
phone : The phone number
url : The URL of the theater
location : Contains the longitude,latitude used to create the Geo-indexed field

Sample Data: theater:20

Field	Value
name	Broadway Theatre
address	1681 Broadway
city	New York
zip	10019
phone	212 944-3700
url	http://www.shubertorganization.com/theatres/broadway.asp
location	-73.98335054631019,40.763270202723625

Users

The file sample-app/redisearch-docker/dataset/import_users.redis is a script that creates 5996 Hashes.

The user hashes contain the following fields.

user:id : The unique ID of the user.
first_name : The first name of the user.
last_name : The last name of the user.
email : The email of the user.
gender : The gender of the user (female/male).
country : The country name of the user.
country_code : The country code of the user.
city : The city of the user.
longitude : The longitude of the user.
latitude : The latitude of the user.
last_login : The last login time for the user, as EPOC time.
ip_address : The IP address of the user.

Sample Data: user:3233

Field	Value
first_name	Rosetta
last_name	Olyff
email	rolyff6g@163.com
gender	female
country	China
country_code	CN
city	Huangdao
longitude	120.04619
latitude	35.872664
last_login	1570386621
ip_address	218.47.90.79

Importing the Movies, Theaters and Users

Before importing the data, flush the database:

> FLUSHALL

The easiest way to import the file is to use the redis-cli, using the following terminal command:

$ redis-cli -h localhost -p 6379 < ./sample-app/redisearch-docker/dataset/import_movies.redis

$ redis-cli -h localhost -p 6379 < ./sample-app/redisearch-docker/dataset/import_theaters.redis


$ redis-cli -h localhost -p 6379 < ./sample-app/redisearch-docker/dataset/import_users.redis

Using Redis Insight or the redis-cli you can look at the dataset:

> HMGET "movie:343" title release_year genre

1) "Spider-Man"
2) "2002"
3) "Action"


>  HMGET "theater:20" name location
1) "Broadway Theatre"
2) "-73.98335054631019,40.763270202723625"



> HMGET "user:343" first_name last_name last_login
1) "Umeko"
2) "Castagno"
3) "1574769122"

You can also use the DBSIZE command to see how many keys you have in your database.

Create Indexes

Create the idx:movie index:

> FT.CREATE idx:movie ON hash PREFIX 1 "movie:" SCHEMA title TEXT SORTABLE plot TEXT WEIGHT 0.5 release_year NUMERIC SORTABLE rating NUMERIC SORTABLE votes NUMERIC SORTABLE genre TAG SORTABLE

"OK"

The movies have now been indexed, you can run the FT.INFO "idx:movie" command and look at the num_docs returned value. (should be 922).

Create the idx:theater index:

This index will mostly be used to show the geospatial capabilties of RediSearch.

In the previous examples we have created indexes with 3 types:

Text
Numeric
Tag

You will now discover a new type of field: Geo.

The theater hashes contains a field location with the longitude and latitude, that will be used in the index as follows:

> FT.CREATE idx:theater ON hash PREFIX 1 "theater:" SCHEMA name TEXT SORTABLE location GEO

"OK"

The theaters have been indexed, you can run the FT.INFO "idx:theater" command and look at the num_docs returned value. (should be 117).

Create the idx:user index:

> FT.CREATE idx:user ON hash PREFIX 1 "user:" SCHEMA gender TAG country TAG SORTABLE last_login NUMERIC SORTABLE location GEO

"OK"

Next: Querying the movie database

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Sample Dataset

Dataset Description

Importing the Movies, Theaters and Users

Create Indexes

FilesExpand file tree

006-import-dataset.md

Latest commit

History

006-import-dataset.md

File metadata and controls

Sample Dataset

Dataset Description

Importing the Movies, Theaters and Users

Create Indexes