Commit 29f3ae3

Enhance Python examples and dataset handling
- Updated README.md for examples to include NULL handling and improved data type descriptions.
- Added download_sample_data.py to automate downloading and modifying the MovieLens dataset, introducing NULL values for testing.
- Updated pyproject.toml to change the Bug Tracker link to the new repository.
1 parent e7c3df9 commit 29f3ae3

17 files changed

Lines changed: 2383 additions & 119 deletions
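
The commit message mentions a `download_sample_data.py` script that modifies the MovieLens dataset to introduce NULL values for testing. That script is not shown in this excerpt, so the following is only a minimal sketch of the general idea, not the script's actual logic. The function name `inject_nulls`, its parameters, and the choice of an empty CSV cell to stand in for NULL are all assumptions made for illustration; the two sample rows use the `movieId,title,genres` header of MovieLens `movies.csv`.

```python
import csv
import io
import random


def inject_nulls(rows, columns, fraction, seed=0):
    """Return copies of `rows` with roughly `fraction` of the values in `columns` blanked.

    Hypothetical helper: an empty string stands in for NULL in the CSV data.
    """
    rng = random.Random(seed)  # seeded so test data is reproducible
    result = []
    for row in rows:
        new_row = dict(row)  # copy, so the input rows are left untouched
        for col in columns:
            if rng.random() < fraction:
                new_row[col] = ""
        result.append(new_row)
    return result


# Tiny in-memory sample instead of a real download.
sample = io.StringIO(
    "movieId,title,genres\n"
    "1,Toy Story (1995),Animation\n"
    "2,Jumanji (1995),Adventure\n"
)
rows = list(csv.DictReader(sample))
modified = inject_nulls(rows, columns=["genres"], fraction=0.5, seed=42)
for row in modified:
    print(row)
```

Seeding the random generator keeps the set of blanked cells stable between runs, which matters when the modified data feeds NULL-handling tests.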

README.md

Lines changed: 2 additions & 0 deletions
@@ -24,6 +24,8 @@ pip install arcadedb-embedded # With Gremlin + GraphQL
 
 **Requirements:** Python 3.8+ and [Java 21+](https://adoptium.net/) (JRE)
 
+**💡 Tip:** See "JVMCI is not enabled" warnings? Install [GraalVM](https://humemai.github.io/arcadedb-embedded-python/latest/getting-started/installation/#eliminate-polyglot-warnings-optional) to fix them
+
 **Technology:** Uses [JPype](https://jpype.readthedocs.io/) to bridge Python and Java, providing direct access to ArcadeDB's embedded engine with minimal overhead.
 
 ### Basic Usage (CRUD)

bindings/python/README.md

Lines changed: 3 additions & 0 deletions
@@ -39,6 +39,9 @@ pip install arcadedb-embedded
 
 **Requirements**: Java 21+ must be installed ([details](https://humemai.github.io/arcadedb-embedded-python/latest/getting-started/installation/#java-runtime-environment-jre))
 
+!!! tip "Eliminate JVMCI Warnings"
+See warnings about "JVMCI is not enabled"? Install [GraalVM](https://humemai.github.io/arcadedb-embedded-python/latest/getting-started/installation/#eliminate-polyglot-warnings-optional) to fix them.
+
 ### 5-Minute Example
 
 ```python

bindings/python/docs/development/troubleshooting.md

Lines changed: 2 additions & 2 deletions
@@ -824,12 +824,12 @@ else:
 - [Examples](../examples/import.md)
 
 2. **Search Issues:**
-- [GitHub Issues](https://github.com/ArcadeData/arcadedb/issues)
+- [GitHub Issues](https://github.com/humemai/arcadedb-embedded-python/issues)
 - [ArcadeDB Documentation](https://docs.arcadedb.com/)
 
 3. **Ask Community:**
 - [Discord](https://discord.gg/arcadedb)
-- [GitHub Discussions](https://github.com/ArcadeData/arcadedb/discussions)
+- [GitHub Discussions](https://github.com/humemai/arcadedb-embedded-python/discussions)
 
 4. **Report Bug:**
 Include:

Lines changed: 85 additions & 42 deletions
@@ -1,16 +1,18 @@
 # Simple Document Store Example
 
-This comprehensive example demonstrates ArcadeDB's document capabilities using real-world scenarios. You'll learn about data types, SQL functions, and the differences between document and graph storage models.
+This comprehensive example demonstrates ArcadeDB's document capabilities using a task management system. You'll learn about data types, NULL handling, SQL operations, and the differences between document and graph storage models.
 
 ## Overview
 
-The example creates a product catalog system showcasing:
+The example creates a task management system showcasing:
 
-- **Rich Data Types** - STRING, BOOLEAN, INTEGER, FLOAT, DECIMAL, DATE, DATETIME, BINARY, EMBEDDED, LINK, Arrays
-- **SQL Functions** - uuid(), date(), sysdate() for dynamic data generation
+- **Rich Data Types** - STRING, BOOLEAN, INTEGER, FLOAT, DECIMAL, DATE, DATETIME, LIST OF STRING, and Arrays
+- **NULL Handling** - INSERT with NULL, UPDATE to NULL, queries with IS NULL/IS NOT NULL
+- **SQL Operations** - Complete CRUD workflow with ArcadeDB SQL
+- **Built-in Functions** - date() for date literals, uuid() for unique IDs, sysdate() for dynamic timestamps
 - **Record Types** - Understanding Documents vs Vertices vs Edges
-- **Schema Evolution** - Adding typed properties for performance and validation
-- **CRUD Operations** - Complete create, read, update, delete workflow
+- **Schema Flexibility** - Typed properties for performance with schema-optional flexibility
+- **Type Safety** - LIST OF STRING for validated array data
 
 ## Source Code
 
@@ -20,37 +22,64 @@ The complete example is available at: [`examples/01_simple_document_store.py`](.
 
 ### 1. Data Type Support
 
-ArcadeDB provides comprehensive data type support:
+ArcadeDB provides comprehensive data type support with NULL handling:
 
 ```python
-# Schema with typed properties for performance
+# Schema with typed properties for performance and validation
+with db.transaction():
+    db.command("sql", "CREATE DOCUMENT TYPE Task")
+    db.command("sql", "CREATE PROPERTY Task.title STRING")
+    db.command("sql", "CREATE PROPERTY Task.priority STRING")
+    db.command("sql", "CREATE PROPERTY Task.completed BOOLEAN")
+    db.command("sql", "CREATE PROPERTY Task.tags LIST OF STRING") # Type-safe arrays
+    db.command("sql", "CREATE PROPERTY Task.created_date DATE")
+    db.command("sql", "CREATE PROPERTY Task.due_datetime DATETIME")
+    db.command("sql", "CREATE PROPERTY Task.estimated_hours FLOAT")
+    db.command("sql", "CREATE PROPERTY Task.priority_score INTEGER")
+    db.command("sql", "CREATE PROPERTY Task.cost DECIMAL")
+    db.command("sql", "CREATE PROPERTY Task.task_id STRING")
+
+# Insert with NULL values for optional fields and uuid() for unique ID
 db.command("sql", """
-    CREATE DOCUMENT TYPE Product (
-        name STRING,
-        description STRING,
-        price DECIMAL,
-        in_stock BOOLEAN,
-        category_id INTEGER,
-        weight FLOAT,
-        created_at DATETIME,
-        tags STRING[],
-        metadata EMBEDDED
-    )
+    INSERT INTO Task SET
+        title = 'Write documentation',
+        priority = 'medium',
+        completed = false,
+        tags = ['work', 'writing'],
+        created_date = date('2024-01-16'),
+        due_datetime = NULL,
+        estimated_hours = 8.0,
+        priority_score = 70,
+        cost = NULL,
+        task_id = uuid()
 """)
 ```
 
-### 2. SQL Functions
+### 2. SQL Functions and NULL Queries
 
-Learn about built-in functions for dynamic data:
+Learn about built-in functions and NULL handling:
 
 ```python
-# UUID generation and date functions
-result = db.command("sql", """
-    INSERT INTO Product SET
-        id = uuid(),
-        name = 'Laptop Pro',
-        created_at = sysdate(),
-        launch_date = date('2024-01-15', 'yyyy-MM-dd')
+# Built-in functions: date() for DATE type, uuid() for unique IDs
+db.command("sql", """
+    INSERT INTO Task SET
+        title = 'Buy groceries',
+        task_id = uuid(),
+        created_date = date('2024-01-15'),
+        due_datetime = '2024-01-20 18:00:00',
+        cost = 150.00
+""")
+
+# Query for NULL values
+result = db.query("sql", "SELECT FROM Task WHERE due_datetime IS NULL")
+result = db.query("sql", "SELECT FROM Task WHERE cost IS NULL")
+
+# UPDATE to set NULL (clear optional values)
+db.command("sql", """
+    UPDATE Task SET
+        cost = NULL,
+        estimated_hours = NULL
+    WHERE title = 'Call dentist'
 """)
 ```
 
@@ -66,10 +95,12 @@ Understanding when to use different record types:
 
 The example demonstrates:
 
-- **Embedded Documents** - Nested JSON-like structures
-- **Arrays** - Collections of values
-- **Schema Evolution** - Adding properties dynamically
-- **Query Optimization** - Using indexes and typed properties
+- **NULL Values** - Optional fields with IS NULL/IS NOT NULL queries
+- **Type-Safe Arrays** - LIST OF STRING for validated collections
+- **DECIMAL Handling** - Java BigDecimal conversion via float(str(value))
+- **DATETIME Literals** - String literals automatically parsed to DATETIME type
+- **Schema-Optional Flexibility** - Define properties for performance, add ad-hoc fields when needed
+- **Query Optimization** - Using typed properties and indexes
 
 ## Running the Example
 
@@ -79,21 +110,24 @@ python 01_simple_document_store.py
 ```
 
 Expected output includes:
+
 - Database creation and schema setup
-- Sample data insertion with various types
-- Query demonstrations
+- Sample tasks with various data types and NULL values
+- Query demonstrations including NULL checks
+- UPDATE operations setting values to NULL
 - File structure explanation
 
 ## Database Structure
 
 After running, examine the created files:
 
-```
-databases/product_catalog/
+```text
+my_test_databases/task_db/
 ├── configuration.json # Database configuration
-├── schema.json # Type definitions and indexes
-├── *.bucket # Data storage files
-└── statistics.json # Database statistics
+├── schema.json # Type definitions with LIST OF STRING
+├── Task_*.bucket # Data storage files with tasks
+├── dictionary.*.dict # String compression dictionary
+└── statistics.json # Database statistics
 ```
 
 ## Next Steps
@@ -106,15 +140,24 @@ After mastering this example:
 
 ## Common Questions
 
+**Q: How does ArcadeDB handle NULL values?**
+A: All ArcadeDB types support NULL by default. You can INSERT NULL, UPDATE to NULL, and query with IS NULL/IS NOT NULL operators.
+
+**Q: What's the difference between LIST and LIST OF STRING?**
+A: LIST is a generic untyped list. LIST OF STRING provides type validation ensuring all elements are strings, giving better performance and data integrity.
+
 **Q: Why use typed properties?**
-A: They provide better performance, validation, and enable advanced features like indexes.
+A: They provide better performance, validation, and enable advanced features like indexes. But ArcadeDB is schema-optional - you can still add properties dynamically.
 
 **Q: When should I use Documents vs Vertices?**
-A: Use Documents for simple data storage (like SQL tables). Use Vertices when you need to model relationships between entities.
+A: Use Documents for simple data storage (like SQL tables). Use Vertices when you need to model relationships between entities with Edges.
+
+**Q: How do I handle Java BigDecimal in Python?**
+A: Convert via string first: `float(str(decimal_value))`. Direct conversion from Java BigDecimal to Python float requires string intermediary.
 
 **Q: Can I mix data types?**
 A: Yes! ArcadeDB is schema-flexible. You can add properties dynamically while benefiting from typed properties where defined.
 
 ---
 
-*Need help? Check our [troubleshooting guide](../troubleshooting.md) or [open an issue](https://github.com/ArcadeData/arcadedb/issues).*
+*Need help? Check our [troubleshooting guide](../troubleshooting.md) or [open an issue](https://github.com/humemai/arcadedb-embedded-python/issues).*
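
The BigDecimal answer added to the Common Questions section recommends `float(str(decimal_value))`. Here is a small sketch of that conversion wrapped in NULL-safe helpers. It assumes only that the value returned for a DECIMAL property has a meaningful string form. `FakeBigDecimal` is a hypothetical stand-in so the snippet runs without a JVM, and the helper names are illustrative rather than part of the bindings.

```python
from decimal import Decimal


class FakeBigDecimal:
    """Hypothetical stand-in for a Java BigDecimal proxy; only str() is used."""

    def __init__(self, text):
        self._text = text

    def __str__(self):
        return self._text


def decimal_to_float(value):
    """Convert a DECIMAL property value to float via its string form.

    None (a NULL property) is passed through unchanged.
    """
    if value is None:
        return None
    return float(str(value))


def decimal_to_exact(value):
    """Same string route, but keep exact precision using Python's Decimal."""
    if value is None:
        return None
    return Decimal(str(value))


cost = FakeBigDecimal("150.00")
print(decimal_to_float(cost))  # 150.0
print(decimal_to_exact(cost))  # 150.00
print(decimal_to_float(None))  # None
```

The `Decimal` variant may be preferable for monetary fields such as `cost`, since `float` cannot represent every decimal fraction exactly.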

bindings/python/docs/examples/02_social_network_graph.md

Lines changed: 56 additions & 1 deletion
@@ -11,9 +11,12 @@ This example demonstrates how to use ArcadeDB as a graph database to model and q
 **What You'll Learn:**
 - Creating vertex and edge types (schema definition)
 - Modeling entities (Person) and relationships (FRIEND_OF) with properties
+- NULL value handling for optional vertex properties (email, phone, reputation)
 - Graph traversal patterns (friends, friends-of-friends, mutual connections)
-- Comparing SQL vs Cypher query languages for graph operations
+- Comparing SQL MATCH vs Cypher query languages for graph operations
+- Variable-length path queries (`*1..3` syntax in Cypher)
 - Working with relationship properties and bidirectional edges
+- Filtering by NULL values (finding people without email/phone)
 - Proper transaction handling and property access patterns
 - Real-world graph database implementation techniques
 
@@ -66,6 +69,11 @@ with db.transaction():
     db.command("sql", "CREATE PROPERTY Person.name STRING")
     db.command("sql", "CREATE PROPERTY Person.age INTEGER")
     db.command("sql", "CREATE PROPERTY Person.city STRING")
+    db.command("sql", "CREATE PROPERTY Person.joined_date DATE")
+    db.command("sql", "CREATE PROPERTY Person.email STRING") # Optional (can be NULL)
+    db.command("sql", "CREATE PROPERTY Person.phone STRING") # Optional (can be NULL)
+    db.command("sql", "CREATE PROPERTY Person.verified BOOLEAN")
+    db.command("sql", "CREATE PROPERTY Person.reputation FLOAT") # Optional (can be NULL)
 
     # Create edge type with properties
     db.command("sql", "CREATE EDGE TYPE FRIEND_OF")
@@ -133,6 +141,53 @@ for row in result:
     print(f"Friend: {name} from {city}")
 ```
 
+## NULL Value Handling in Graphs
+
+Graph vertices can have optional properties with NULL values:
+
+```python
+# Insert person with NULL email and phone
+with db.transaction():
+    db.command("sql", """
+        CREATE VERTEX Person SET
+            name = 'Eve Brown',
+            age = 29,
+            city = 'Seattle',
+            joined_date = date('2020-08-22'),
+            email = NULL,
+            phone = NULL,
+            verified = false,
+            reputation = 3.2
+    """)
+```
+
+### Querying for NULL Values
+
+Find vertices with missing optional properties:
+
+```python
+# Find people without email addresses
+result = db.query("sql", """
+    SELECT name, phone, verified
+    FROM Person
+    WHERE email IS NULL
+""")
+
+# Find verified people with reputation scores (exclude NULLs)
+result = db.query("sql", """
+    SELECT name, reputation, city
+    FROM Person
+    WHERE verified = true AND reputation IS NOT NULL
+    ORDER BY reputation DESC
+""")
+```
+
+This pattern is useful for:
+- Finding incomplete profiles
+- Identifying missing contact information
+- Filtering by data completeness
+- Quality checks and data validation
+
 ## Advanced Graph Patterns
 
 ### Friends of Friends
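
The NULL-handling section added to `02_social_network_graph.md` lists "finding incomplete profiles" and "filtering by data completeness" as uses of `IS NULL` queries. The sketch below shows the same check done client-side in plain Python. It assumes query rows have already been converted to dictionaries in which a NULL property appears as `None`; that conversion is not shown in this diff. The helper names are hypothetical, and apart from Eve Brown the sample rows are invented for illustration.

```python
def missing_fields(row, optional_fields):
    """Return the optional fields that are None (NULL) or absent in this row."""
    return [name for name in optional_fields if row.get(name) is None]


def completeness_report(rows, optional_fields):
    """Map each person's name to the optional fields they are missing.

    People with no missing fields are left out of the report.
    """
    report = {}
    for row in rows:
        gaps = missing_fields(row, optional_fields)
        if gaps:
            report[row["name"]] = gaps
    return report


# Assumed shape of converted query rows; only Eve Brown comes from the example.
people = [
    {"name": "Eve Brown", "email": None, "phone": None, "reputation": 3.2},
    {"name": "Sample A", "email": "a@example.com", "phone": None, "reputation": None},
    {"name": "Sample B", "email": "b@example.com", "phone": "555-0100", "reputation": 4.5},
]

report = completeness_report(people, ["email", "phone", "reputation"])
for name, gaps in report.items():
    print(f"{name}: missing {', '.join(gaps)}")
```

For large graphs, filtering in the database with `WHERE email IS NULL` avoids transferring complete rows; a client-side pass like this is mainly useful for reporting on several optional fields at once.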

bindings/python/docs/examples/03_vector_search.md

Lines changed: 1 addition & 1 deletion
@@ -457,4 +457,4 @@ Until the implementation stabilizes, consider mature alternatives:
 
 ---
 
-*This example is educational and demonstrates current capabilities. Monitor [ArcadeDB releases](https://github.com/ArcadeData/arcadedb/releases) for vector search stability updates.*
+*This example is educational and demonstrates current capabilities. Monitor [ArcadeDB releases](https://github.com/humemai/arcadedb-embedded-python/releases) for vector search stability updates.*
