Comprehensive troubleshooting guide for OpenSearch-related issues in the Multi-Modal Academic Research System.
- Connection Issues
- Indexing Failures
- Search Performance Problems
- Memory Issues
- Cluster Health
- Index Corruption
- Mapping Conflicts
- Diagnostic Commands
Symptoms:
- ConnectionError: Connection refused
- ConnectionTimeout
- TransportError(N/A, 'Unable to connect')
Diagnosis:
# Check if OpenSearch is running
curl http://localhost:9200
# Check Docker containers
docker ps | grep opensearch
# Check port availability
lsof -i :9200
netstat -an | grep 9200
Solutions:
- Start OpenSearch if not running:
docker run -d \
--name opensearch-node \
-p 9200:9200 -p 9600:9600 \
-e "discovery.type=single-node" \
-e "DISABLE_SECURITY_PLUGIN=true" \
opensearchproject/opensearch:latest
- Check Docker daemon:
# Ensure Docker is running
docker info
# Restart Docker if needed
# Mac: Restart Docker Desktop
# Linux: sudo systemctl restart docker
- Verify network configuration:
# Test connectivity
telnet localhost 9200
# Check firewall rules
sudo iptables -L | grep 9200 # Linux
# or
sudo pfctl -s rules | grep 9200 # Mac
- Fix host/port mismatch:
# In opensearch_manager.py
client = OpenSearch(
hosts=[{'host': 'localhost', 'port': 9200}],
http_compress=True,
use_ssl=False,
verify_certs=False,
timeout=30
)
Symptoms:
- SSLError: [SSL: CERTIFICATE_VERIFY_FAILED]
- ConnectionError: Caused by SSLError
Solutions:
- Disable SSL for local development:
docker run -d \
--name opensearch-node \
-p 9200:9200 \
-e "discovery.type=single-node" \
-e "DISABLE_SECURITY_PLUGIN=true" \
-e "plugins.security.ssl.http.enabled=false" \
opensearchproject/opensearch:latest
- Update client configuration:
from opensearchpy import OpenSearch
client = OpenSearch(
hosts=[{'host': 'localhost', 'port': 9200}],
http_compress=True,
use_ssl=False,
verify_certs=False,
ssl_show_warn=False
)
Symptoms:
- AuthenticationException
- 403 Forbidden
- 401 Unauthorized
Solutions:
- Disable security for local development:
docker run -d \
--name opensearch-node \
-p 9200:9200 \
-e "discovery.type=single-node" \
-e "DISABLE_SECURITY_PLUGIN=true" \
opensearchproject/opensearch:latest
- Use basic authentication (if security enabled):
client = OpenSearch(
hosts=[{'host': 'localhost', 'port': 9200}],
http_auth=('admin', 'admin'), # Default credentials before OpenSearch 2.12; newer versions use the password set via OPENSEARCH_INITIAL_ADMIN_PASSWORD
use_ssl=True,
verify_certs=False
)
Symptoms:
- RequestError: resource_already_exists_exception
- TransportError(400, 'illegal_argument_exception')
Diagnosis:
# Check if index exists
curl http://localhost:9200/_cat/indices?v
# Get index details
curl http://localhost:9200/research_assistant
Solutions:
- Delete existing index (WARNING: loses data):
curl -X DELETE http://localhost:9200/research_assistant
- Handle in code:
from opensearchpy.exceptions import RequestError
try:
    client.indices.create(index='research_assistant', body=settings)
except RequestError as e:
    # e.error holds the error type string returned by OpenSearch
    if e.error == 'resource_already_exists_exception':
        print("Index already exists")
    else:
        raise
- Use index templates for reusability:
index_template = {
"index_patterns": ["research_*"],
"template": {
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0
},
"mappings": {
"properties": {
"embedding": {
"type": "knn_vector",
"dimension": 384
}
}
}
}
}
client.indices.put_index_template(
name='research_template',
body=index_template
)
Symptoms:
- BulkIndexError
- Documents not appearing in index
- Partial indexing success
Diagnosis:
# Check bulk response
response = client.bulk(body=bulk_data)
if response['errors']:
    for item in response['items']:
        if 'error' in item.get('index', {}):
            print(f"Error: {item['index']['error']}")
Solutions:
- Handle errors gracefully:
from opensearchpy import helpers
def index_documents(documents):
    actions = [
        {
            "_index": "research_assistant",
            "_id": doc['id'],
            "_source": doc
        }
        for doc in documents
    ]
    # With raise_on_error=False, failures are returned instead of raised
    success, failed = helpers.bulk(
        client,
        actions,
        raise_on_error=False,
        raise_on_exception=False
    )
    print(f"Indexed: {success}, Failed: {len(failed)}")
    for item in failed:
        print(f"Failed document: {item}")
- Reduce batch size:
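import time
from opensearchpy import helpers
def index_in_batches(documents, batch_size=100):
    for i in range(0, len(documents), batch_size):
        batch = documents[i:i+batch_size]
        # prepare_actions: your own helper that builds bulk actions for a batch
        helpers.bulk(client, prepare_actions(batch))
        time.sleep(1)  # Rate limiting
- Validate data before indexing: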
def validate_document(doc):
    required_fields = ['id', 'title', 'content']
    for field in required_fields:
        if field not in doc:
            raise ValueError(f"Missing field: {field}")
    # Validate embedding dimension
    if 'embedding' in doc:
        if len(doc['embedding']) != 384:
            raise ValueError("Invalid embedding dimension")
    return True
Symptoms:
- mapper_parsing_exception
- illegal_argument_exception: mapper [field] cannot be changed
Diagnosis:
# Get current mapping
curl http://localhost:9200/research_assistant/_mapping?pretty
# Check field types
curl http://localhost:9200/research_assistant/_mapping/field/embedding
Solutions:
- Reindex with correct mapping:
# Create new index with correct mapping
new_index_settings = {
"mappings": {
"properties": {
"embedding": {
"type": "knn_vector",
"dimension": 384
},
"publication_date": {
"type": "date",
"format": "yyyy-MM-dd||epoch_millis"
}
}
}
}
client.indices.create(index='research_assistant_v2', body=new_index_settings)
# Reindex data
client.reindex(
body={
"source": {"index": "research_assistant"},
"dest": {"index": "research_assistant_v2"}
}
)
# Switch alias
client.indices.delete_alias(index='research_assistant', name='research')
client.indices.put_alias(index='research_assistant_v2', name='research')
- Use dynamic mapping carefully:
index_settings = {
"settings": {
"index": {
"mapping": {
"ignore_malformed": True, # Ignore badly formatted data
"coerce": True # Try to convert types
}
}
}
}
Symptoms:
- Queries taking > 5 seconds
- Timeout errors
- High CPU usage
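To check the first symptom objectively, note that a search response reports its server-side execution time in the `took` field (milliseconds). The following is a minimal sketch, assuming a response dict shaped like the one `client.search` returns. The helper names `classify_latency` and `timed_search` are illustrative, and the 5000 ms threshold simply mirrors the "> 5 seconds" symptom above.

```python
import time

SLOW_QUERY_MS = 5000  # assumption: mirrors the "> 5 seconds" symptom


def classify_latency(response, wall_ms, slow_ms=SLOW_QUERY_MS):
    """Compare server-side time ('took') with client-side wall time."""
    took_ms = response.get("took", 0)
    # Time spent outside query execution (network, serialization, queueing)
    overhead_ms = max(wall_ms - took_ms, 0)
    return {
        "took_ms": took_ms,
        "overhead_ms": overhead_ms,
        "timed_out": bool(response.get("timed_out", False)),
        "slow": took_ms > slow_ms,
    }


def timed_search(client, index, body):
    """Run one search and report where the time went."""
    start = time.perf_counter()
    response = client.search(index=index, body=body)
    wall_ms = (time.perf_counter() - start) * 1000
    return classify_latency(response, wall_ms)
```

A large `took_ms` points at the query or the index itself, while a large `overhead_ms` with a small `took_ms` suggests the delay is in the network or the client.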
Diagnosis:
# Check query performance
curl -X GET "localhost:9200/research_assistant/_search?pretty" \
-H 'Content-Type: application/json' \
-d '{"profile": true, "query": {"match_all": {}}}'
# Check cluster stats
curl http://localhost:9200/_cluster/stats?pretty
# Monitor slow queries
curl http://localhost:9200/_nodes/stats/indices/search?pretty
Solutions:
- Optimize query structure:
# Use filters instead of queries when possible (cached)
query = {
"query": {
"bool": {
"must": [
{"match": {"content": query_text}}
],
"filter": [ # Filters are faster and cached
{"term": {"content_type": "paper"}},
{"range": {"publication_date": {"gte": "2020-01-01"}}}
]
}
}
}
- Add result size limits:
results = client.search(
index='research_assistant',
body=query,
size=10, # Limit results
request_timeout=30
)
- Use pagination for large result sets:
from opensearchpy import helpers
def search_with_pagination(query, page_size=100):
    results = helpers.scan(
        client,
        index='research_assistant',
        query=query,
        size=page_size,
        scroll='2m'
    )
    for hit in results:
        yield hit
- Optimize kNN search:
# Reduce candidates for kNN
knn_query = {
"size": 10,
"query": {
"knn": {
"embedding": {
"vector": query_embedding,
"k": 10, # Number of nearest neighbors
"method_parameters": {
"ef_search": 100 # Reduce for faster search
}
}
}
}
}
- Enable request caching:
# Update index settings
curl -X PUT "localhost:9200/research_assistant/_settings" \
-H 'Content-Type: application/json' \
-d '{
"index.requests.cache.enable": true
}'
# index.queries.cache.enabled is a static setting (default: true); it can only be set at index creation or on a closed index
Symptoms:
- Slow search and indexing
- High disk I/O in docker stats
Solutions:
- Increase refresh interval:
# Reduce index refresh frequency
client.indices.put_settings(
index='research_assistant',
body={
"index": {
"refresh_interval": "30s" # Default is 1s
}
}
)
# Disable during bulk indexing
client.indices.put_settings(
index='research_assistant',
body={"index": {"refresh_interval": "-1"}}
)
# Re-enable after indexing
client.indices.put_settings(
index='research_assistant',
body={"index": {"refresh_interval": "1s"}}
)
- Force merge segments:
# Reduce number of segments
curl -X POST "localhost:9200/research_assistant/_forcemerge?max_num_segments=1"
Symptoms:
- OutOfMemoryError: Java heap space
- Container crashes
- Degraded performance
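Heap pressure can often be spotted before the container crashes, because the node stats API reports `heap_used_percent` for each node under `jvm.mem`. The sketch below works on a response dict of that shape. The function name `heap_report` and the 75% warning threshold are assumptions chosen for illustration, not values taken from this system.

```python
def heap_report(node_stats, warn_percent=75):
    """Return (node_name, heap_used_percent, over_threshold) for each node.

    node_stats is the dict returned by the _nodes/stats/jvm endpoint.
    warn_percent is an illustrative threshold, not an OpenSearch default.
    """
    report = []
    for node in node_stats.get("nodes", {}).values():
        used = node["jvm"]["mem"]["heap_used_percent"]
        report.append((node.get("name", "unknown"), used, used >= warn_percent))
    return report
```

With opensearch-py the input could be obtained with `client.nodes.stats(metric="jvm")`, for example `heap_report(client.nodes.stats(metric="jvm"))`.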
Diagnosis:
# Check memory usage
docker stats opensearch-node
# Check JVM heap
curl http://localhost:9200/_nodes/stats/jvm?pretty
Solutions:
- Increase heap size:
docker run -d \
--name opensearch-node \
-p 9200:9200 \
-e "discovery.type=single-node" \
-e "OPENSEARCH_JAVA_OPTS=-Xms2g -Xmx2g" \
--memory=4g \
opensearchproject/opensearch:latest
- Reduce field data cache:
curl -X PUT "localhost:9200/_cluster/settings" \
-H 'Content-Type: application/json' \
-d '{
"persistent": {
"indices.fielddata.cache.size": "20%"
}
}'
- Clear caches:
# Clear field data cache
curl -X POST "localhost:9200/_cache/clear?fielddata=true"
# Clear query cache
curl -X POST "localhost:9200/_cache/clear?query=true"
# Clear all caches
curl -X POST "localhost:9200/_cache/clear"
- Optimize index settings:
index_settings = {
    "settings": {
        "number_of_shards": 1, # Reduce for single-node
        "number_of_replicas": 0, # No replicas for local dev
        "codec": "best_compression" # Trade CPU for a smaller on-disk index
    }
}
Symptoms:
- Yellow cluster status
- Red cluster status
- Unassigned shards
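The three symptoms above map onto fields of the `_cluster/health` response: `status`, `unassigned_shards`, and `number_of_nodes`. As a rough sketch, a small helper can turn that response into a hint about what to look at next. The function name `explain_health` and the wording of the hints are illustrative only.

```python
def explain_health(health):
    """Map a _cluster/health response dict to a short, human-readable hint."""
    status = health.get("status")
    unassigned = health.get("unassigned_shards", 0)
    single_node = health.get("number_of_nodes", 0) == 1

    if status == "green":
        return "green: all primary and replica shards are assigned"
    if status == "yellow":
        if single_node:
            # Replicas are never placed on the node that holds the primary
            return (f"yellow: {unassigned} replica shard(s) cannot be placed on a "
                    "single node; setting number_of_replicas to 0 clears this")
        return f"yellow: {unassigned} replica shard(s) unassigned; check allocation"
    if status == "red":
        return "red: at least one primary shard is unassigned; data may be unavailable"
    return f"unknown status: {status!r}"
```

With opensearch-py this could be called as `explain_health(client.cluster.health())`.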
Diagnosis:
# Check cluster health
curl http://localhost:9200/_cluster/health?pretty
# Check shard allocation
curl http://localhost:9200/_cat/shards?v
# Get allocation explanation
curl http://localhost:9200/_cluster/allocation/explain?pretty
Solutions:
- Yellow status (unassigned replicas):
# Normal for single-node cluster
# Set replicas to 0
curl -X PUT "localhost:9200/research_assistant/_settings" \
-H 'Content-Type: application/json' \
-d '{"index": {"number_of_replicas": 0}}'
- Red status (missing primary shards):
# Try to reroute shards
curl -X POST "localhost:9200/_cluster/reroute?retry_failed=true"
# If that fails, you may need to restore from a snapshot
- Enable shard allocation:
curl -X PUT "localhost:9200/_cluster/settings" \
-H 'Content-Type: application/json' \
-d '{
"persistent": {
"cluster.routing.allocation.enable": "all"
}
}'
Symptoms:
- CorruptIndexException
- Missing or corrupted data
- Search returns incomplete results
Diagnosis:
# Check index health
curl http://localhost:9200/_cat/indices/research_assistant?v
# Verify shard status
curl http://localhost:9200/_cat/shards/research_assistant?v
Solutions:
- Close and reopen index:
curl -X POST "localhost:9200/research_assistant/_close"
curl -X POST "localhost:9200/research_assistant/_open"
- Try to recover:
curl -X POST "localhost:9200/_cluster/reroute?retry_failed=true"
- Recreate index from source data:
# Delete corrupted index
client.indices.delete(index='research_assistant')
# Recreate with proper settings
client.indices.create(index='research_assistant', body=index_settings)
# Reindex all documents
# Run your data collection and indexing pipeline again
# Cluster health overview
curl http://localhost:9200/_cluster/health?pretty
# Node information
curl http://localhost:9200/_nodes/stats?pretty
# Index statistics
curl http://localhost:9200/_cat/indices?v
# Shard allocation
curl http://localhost:9200/_cat/shards?v
# Pending tasks
curl http://localhost:9200/_cat/pending_tasks?v
# Thread pool stats
curl http://localhost:9200/_cat/thread_pool?v
# Hot threads
curl http://localhost:9200/_nodes/hot_threads
# Search performance
curl http://localhost:9200/_nodes/stats/indices/search?pretty
# Indexing performance
curl http://localhost:9200/_nodes/stats/indices/indexing?pretty
# Cache statistics
curl http://localhost:9200/_nodes/stats/indices/query_cache,fielddata,request_cache?pretty
# JVM memory
curl http://localhost:9200/_nodes/stats/jvm?pretty
# Disk usage
curl http://localhost:9200/_nodes/stats/fs?pretty
# Index settings
curl http://localhost:9200/research_assistant/_settings?pretty
# Index mappings
curl http://localhost:9200/research_assistant/_mapping?pretty
# Index stats
curl http://localhost:9200/research_assistant/_stats?pretty
# Document count
curl http://localhost:9200/research_assistant/_count
# Sample documents
curl -X GET "localhost:9200/research_assistant/_search?size=1&pretty"
- Monitor cluster health:
#!/bin/bash
# Monitoring script: print the cluster status every 60 seconds (requires jq)
while true; do
  curl -s http://localhost:9200/_cluster/health | jq .status
  sleep 60
done
- Clear old logs:
docker logs opensearch-node --tail 1000 > opensearch.log
docker exec opensearch-node find /usr/share/opensearch/logs -mtime +7 -delete
- Optimize indices regularly:
# Weekly optimization
curl -X POST "localhost:9200/research_assistant/_forcemerge?max_num_segments=1"
- Backup important indices:
# Create snapshot repository (the location must be listed under path.repo in opensearch.yml)
curl -X PUT "localhost:9200/_snapshot/backup" \
-H 'Content-Type: application/json' \
-d '{
"type": "fs",
"settings": {
"location": "/usr/share/opensearch/backup"
}
}'
# Create snapshot
curl -X PUT "localhost:9200/_snapshot/backup/snapshot_1?wait_for_completion=true"
For OpenSearch-specific issues:
- Check OpenSearch logs: docker logs opensearch-node
- Enable JVM garbage-collection logging: set OPENSEARCH_JAVA_OPTS to include -Xlog:gc*
- Visit OpenSearch forums: https://forum.opensearch.org/
- Check GitHub issues: https://github.com/opensearch-project/OpenSearch