The previous steps used only a few movies. Now import more data:
- More movies to discover more queries.
- Theaters to discover the geospatial capabilities.
- Users to do some aggregations.
Movies
The file sample-app/redisearch-docker/dataset/import_movies.redis is a script that creates 922 Hashes.
The movie hashes contain the following fields.
- movie:id: The unique ID of the movie, internal to this database (used as the key of the hash).
- title: The title of the movie.
- plot: A summary of the movie.
- genre: The genre of the movie, for now a movie will only have a single genre.
- release_year: The year the movie was released as a numerical value.
- rating: A numeric value representing the public's rating for this movie.
- votes: Number of votes.
- poster: Link to the movie poster.
- imdb_id: id of the movie in the IMDB database.
Sample Data: movie:343
| Field | Value |
|---|---|
| title | Spider-Man |
| plot | When bitten by a genetically modified spider a nerdy shy and awkward high school student gains spider-like abilities that he eventually must use to fight evil as a superhero after tragedy befalls his family. |
| genre | Action |
| release_year | 2002 |
| rating | 7.3 |
| votes | 662219 |
| poster | https://m.media-amazon.com/images/M/MV5BZDEyN2NhMjgtMjdhNi00MmNlLWE5YTgtZGE4MzNjMTRlMGEwXkEyXkFqcGdeQXVyNDUyOTg3Njg@._V1_SX300.jpg |
| imdb_id | tt0145487 |
Theaters
The file sample-app/redisearch-docker/dataset/import_theaters.redis is a script that creates 117 Hashes (used for geospatial queries). This dataset lists New York theaters rather than movie theaters, but that is not critical for this project ;).
The theater hashes contain the following fields.
- theater:id: The unique ID of the theater, internal to this database (used as the key of the hash).
- name: The name of the theater.
- address: The street address.
- city: The city, in this sample dataset all the theaters are in New York.
- zip: The zip code.
- phone: The phone number.
- url: The URL of the theater.
- location: Contains the longitude,latitude used to create the Geo-indexed field.
Sample Data: theater:20
| Field | Value |
|---|---|
| name | Broadway Theatre |
| address | 1681 Broadway |
| city | New York |
| zip | 10019 |
| phone | 212 944-3700 |
| url | http://www.shubertorganization.com/theatres/broadway.asp |
| location | -73.98335054631019,40.763270202723625 |
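The location value packs both coordinates into one string, longitude first. A minimal Python sketch of how a client might split and sanity-check such a value follows. The function name `parse_location` is hypothetical and not part of the sample app, and the bounds are the usual geographic ranges rather than limits taken from RediSearch.

```python
def parse_location(value: str) -> tuple[float, float]:
    """Split a "longitude,latitude" string into two floats."""
    lon_text, lat_text = value.split(",")
    lon, lat = float(lon_text), float(lat_text)
    # Reject values outside the usual geographic ranges (assumed bounds).
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    return lon, lat


# Value copied from the theater:20 sample above.
lon, lat = parse_location("-73.98335054631019,40.763270202723625")
print(lon, lat)
```

A check like this catches the common mistake of writing the pair as latitude,longitude whenever the swapped value falls outside the latitude range.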
Users
The file sample-app/redisearch-docker/dataset/import_users.redis is a script that creates 5996 Hashes.
The user hashes contain the following fields.
- user:id: The unique ID of the user.
- first_name: The first name of the user.
- last_name: The last name of the user.
- email: The email of the user.
- gender: The gender of the user (female/male).
- country: The country name of the user.
- country_code: The country code of the user.
- city: The city of the user.
- longitude: The longitude of the user.
- latitude: The latitude of the user.
- last_login: The last login time for the user, as epoch time.
- ip_address: The IP address of the user.
Sample Data: user:3233
| Field | Value |
|---|---|
| first_name | Rosetta |
| last_name | Olyff |
| email | rolyff6g@163.com |
| gender | female |
| country | China |
| country_code | CN |
| city | Huangdao |
| longitude | 120.04619 |
| latitude | 35.872664 |
| last_login | 1570386621 |
| ip_address | 218.47.90.79 |
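The last_login value is a count of seconds since the Unix epoch, stored as a string in the hash. A small Python sketch of turning it into a readable UTC timestamp is shown here; `login_time` is a hypothetical helper name used only for this illustration.

```python
from datetime import datetime, timezone


def login_time(last_login: str) -> datetime:
    """Convert an epoch-seconds string, as stored in the hash, to a UTC datetime."""
    return datetime.fromtimestamp(int(last_login), tz=timezone.utc)


# Value copied from the user:3233 sample above.
print(login_time("1570386621").isoformat())
```

Keeping the value numeric in Redis is what allows it to be indexed as a NUMERIC field; the conversion is only needed for display.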
Before importing the data, flush the database:
> FLUSHALL
The easiest way to import the files is with redis-cli, using the following terminal commands:
$ redis-cli -h localhost -p 6379 < ./sample-app/redisearch-docker/dataset/import_movies.redis
$ redis-cli -h localhost -p 6379 < ./sample-app/redisearch-docker/dataset/import_theaters.redis
$ redis-cli -h localhost -p 6379 < ./sample-app/redisearch-docker/dataset/import_users.redis
Using Redis Insight or the redis-cli you can look at the dataset:
> HMGET "movie:343" title release_year genre
1) "Spider-Man"
2) "2002"
3) "Action"
> HMGET "theater:20" name location
1) "Broadway Theatre"
2) "-73.98335054631019,40.763270202723625"
> HMGET "user:343" first_name last_name last_login
1) "Umeko"
2) "Castagno"
3) "1574769122"
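You can also use the DBSIZE command to see how many keys you have in your database. If the database contains only the three datasets above, it should report 7035 keys (922 + 117 + 5996).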
Create the idx:movie index:
> FT.CREATE idx:movie ON hash PREFIX 1 "movie:" SCHEMA title TEXT SORTABLE plot TEXT WEIGHT 0.5 release_year NUMERIC SORTABLE rating NUMERIC SORTABLE votes NUMERIC SORTABLE genre TAG SORTABLE
"OK"
The movies have now been indexed. Run the FT.INFO "idx:movie" command and check the num_docs value in the reply, which should be 922.
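Over the RESP2 protocol, FT.INFO replies with a flat list of alternating names and values. A minimal Python sketch of reading num_docs from such a reply follows. The reply literal is a hypothetical, heavily shortened stand-in for real server output, and `info_to_dict` is an illustrative helper, not a client-library function.

```python
def info_to_dict(reply: list) -> dict:
    """Pair up a flat [name, value, name, value, ...] reply into a dict."""
    return dict(zip(reply[0::2], reply[1::2]))


# Hypothetical, shortened reply; a real FT.INFO reply has many more entries.
sample_reply = ["index_name", "idx:movie", "num_docs", "922"]

info = info_to_dict(sample_reply)
if int(info["num_docs"]) != 922:
    print("unexpected document count:", info["num_docs"])
else:
    print("all 922 movies are indexed")
```

The same check applies to the other indexes by swapping in their expected counts.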
Create the idx:theater index:
This index will mostly be used to show the geospatial capabilities of RediSearch.
In the previous examples we created indexes with 3 types of fields:
- Text
- Numeric
- Tag
You will now discover a new type of field: Geo.
The theater hashes contain a location field with the longitude and latitude, which is used in the index as follows:
> FT.CREATE idx:theater ON hash PREFIX 1 "theater:" SCHEMA name TEXT SORTABLE location GEO
"OK"
The theaters have been indexed. Run the FT.INFO "idx:theater" command and check the num_docs value in the reply, which should be 117.
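A Geo field lets queries filter documents by their distance from a point. As an offline illustration of that kind of filter (not RediSearch's internal implementation), this Python sketch applies the haversine formula to the Broadway Theatre coordinates from the sample data. The query point near Times Square and the mean Earth radius of 6371 km are assumptions chosen for the example.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometers between two lon/lat points."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def within_radius(point, center, radius_km):
    """True if point lies within radius_km of center; both are (lon, lat)."""
    return haversine_km(*point, *center) <= radius_km


broadway_theatre = (-73.98335054631019, 40.763270202723625)  # theater:20
query_point = (-73.9855, 40.7580)  # assumed point near Times Square

print(within_radius(broadway_theatre, query_point, 1.0))
```

Note that both tuples keep the longitude first, matching the order used in the location field.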
Create the idx:user index:
> FT.CREATE idx:user ON hash PREFIX 1 "user:" SCHEMA gender TAG country TAG SORTABLE last_login NUMERIC SORTABLE location GEO
"OK"