Commit efdfba7

Author: BoilingData
Merge pull request #1 from boilingdata/readme-updates
Update readme
2 parents 0ae8644 + 3186cea commit efdfba7

1 file changed

Lines changed: 78 additions & 2 deletions

README.md

@@ -3,15 +3,19 @@
33
![CI](https://github.com/boilingdata/node-boilingdata/workflows/CI/badge.svg?branch=main)
44
![BuiltBy](https://img.shields.io/badge/TypeScript-Lovers-black.svg "img.shields.io")
55

6+
## Installing the SDK
7+
68
```shell
79
yarn add @boilingdata/node-boilingdata
810
```
911

12+
## Basic Example
13+
1014
```typescript
11-
import { BoilingData, isDataResponse } from "../boilingdata/boilingdata";
15+
import { BoilingData, isDataResponse } from "@boilingdata/node-boilingdata";
1216

1317
async function main() {
14-
const bdInstance = new BoilingData({ process.env["BD_USERNAME"], process.env["BD_PASSWORD"] });
18+
const bdInstance = new BoilingData({ username: process.env["BD_USERNAME"], password: process.env["BD_PASSWORD"] });
1519
await bdInstance.connect();
1620
const rows = await new Promise<any[]>((resolve, reject) => {
1721
let r: any[] = [];
@@ -33,3 +37,75 @@ async function main() {
3337
```
3438

3539
This repository contains the JS/TS BoilingData SDK. Please see the integration tests in `tests/query.test.ts` for more examples.
40+
41+
### Callbacks
42+
43+
The SDK uses the BoilingData WebSocket API in the background, meaning that events can arrive at any time. We provide a range of global and query-specific callbacks so that you can hook into the events that you care about.
44+
45+
All callbacks work in both the global scope and the query scope. Global callbacks are always executed when a message arrives, whereas query callbacks are only executed when messages relating to that query arrive.
46+
47+
- onRequest - This event happens when your application sends a request to BoilingData
48+
- onData - Query data response. A single query may have many onData events as processing is parallelised in the background.
49+
- onQueryFinished - The processing of data has completed, and you should not expect any further onData events (although more info messages may arrive)
50+
- onLambdaEvent - the status of your datasets, i.e. warm, warmingUp, shutdown
51+
- onSocketOpen - executed when the socket API successfully opens (so it is safe to start sending SQL queries)
52+
- onSocketClose - executed when the socket API has closed (intentionally or not)
53+
- onInfo - information about a query - connection time, query time, execution time, etc.
54+
- onLogError - Log Errors, such as SQL syntax errors.
55+
- onLogWarn - Log warning messages
56+
- onLogInfo - Log info messages
57+
- onLogDebug - Log debug messages
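
To make the difference between the two scopes concrete, here is a minimal, self-contained sketch of how such a dispatch could work. It is an illustration only, not the SDK's internal code: the `CallbackDispatcher` class, its `register` and `dispatch` methods, and the choice to run the global handler first are all assumptions made for this example.

```typescript
// Hypothetical model of callback scoping; this is NOT the SDK's implementation.
type Callback = (message: unknown) => void;
type Callbacks = { [eventName: string]: Callback | undefined };

class CallbackDispatcher {
  private readonly globalCallbacks: Callbacks;
  private readonly queryCallbacks = new Map<string, Callbacks>();

  constructor(globalCallbacks: Callbacks = {}) {
    this.globalCallbacks = globalCallbacks;
  }

  // Register callbacks that only apply to messages for one request ID.
  register(requestId: string, callbacks: Callbacks): void {
    this.queryCallbacks.set(requestId, callbacks);
  }

  // Deliver one event: the global handler always runs (first, by assumption),
  // then the handler registered for the matching query, if any.
  dispatch(eventName: string, requestId: string | undefined, message: unknown): void {
    this.globalCallbacks[eventName]?.(message);
    if (requestId !== undefined) {
      this.queryCallbacks.get(requestId)?.[eventName]?.(message);
    }
  }
}

const seen: string[] = [];
const dispatcher = new CallbackDispatcher({ onData: () => seen.push("global") });
dispatcher.register("q1", { onData: () => seen.push("q1") });
dispatcher.register("q2", { onData: () => seen.push("q2") });

// A message for query q1 reaches the global handler and q1's handler, but not q2's.
dispatcher.dispatch("onData", "q1", { rows: [] });
console.log(seen);
```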
58+
59+
#### Setting Global Callbacks
60+
61+
Global callbacks can be set when creating the BoilingData instance.
62+
```typescript
63+
new BoilingData({
64+
username, password,
65+
globalCallbacks: {
66+
onRequest: req => { console.log("A new request has been made with ID", req.requestId)},
67+
onQueryFinished: req => { console.log("Request complete!", req.requestId)},
68+
onLogError: message => { console.error("LogError",message)},
69+
onSocketOpen: (socketInstance) => {
70+
console.log("The socket has opened!")
71+
},
72+
onLambdaEvent: message => {
73+
console.log("Change in status of dataset: ",message)
74+
}
75+
}
76+
});
77+
```
78+
79+
#### Setting Query-level Callbacks
80+
81+
Query callbacks are set when creating the query.
82+
```typescript
83+
bdInstance.execQuery({
84+
sql: `SELECT COUNT(*) AS count FROM parquet_scan('s3://boilingdata-demo/demo2.parquet');`,
85+
callbacks: {
86+
onData: data => {
87+
console.log("Some data for this query arrived",data)
88+
},
89+
onQueryFinished: () => resolve(r),
90+
onLogError: (data: any) => reject(data),
91+
92+
},
93+
});
94+
```
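
The snippet above calls `resolve` and `reject`, which only exist inside a surrounding `Promise`. The following sketch shows one way to wrap a query so that every `onData` payload is collected and returned once `onQueryFinished` fires. The callback names come from this README; the `QueryRunner` interface, the `collectData` helper, the `unknown` payload type, and the `fakeRunner` stand-in are assumptions for illustration, so that the example runs without credentials.

```typescript
// Callback names follow the README; the payload type is left opaque (assumption).
interface QueryCallbacks {
  onData?: (data: unknown) => void;
  onQueryFinished?: () => void;
  onLogError?: (error: unknown) => void;
}

// Hypothetical minimal shape of an object that can run a query.
interface QueryRunner {
  execQuery(params: { sql: string; callbacks: QueryCallbacks }): void;
}

// Resolve with every onData payload once onQueryFinished fires; reject on onLogError.
function collectData(runner: QueryRunner, sql: string): Promise<unknown[]> {
  return new Promise<unknown[]>((resolve, reject) => {
    const chunks: unknown[] = [];
    runner.execQuery({
      sql,
      callbacks: {
        onData: (data) => chunks.push(data),
        onQueryFinished: () => resolve(chunks),
        onLogError: (error) => reject(error),
      },
    });
  });
}

// Stand-in for a connected SDK instance: emits two chunks, then finishes.
const fakeRunner: QueryRunner = {
  execQuery({ callbacks }) {
    setTimeout(() => {
      callbacks.onData?.({ part: 1 });
      callbacks.onData?.({ part: 2 });
      callbacks.onQueryFinished?.();
    }, 0);
  },
};

collectData(fakeRunner, "SELECT 1;").then((chunks) => {
  console.log("Received", chunks.length, "chunks");
});
```

With a real, connected instance you would pass `bdInstance` in place of `fakeRunner`, provided its `execQuery` accepts the same parameters.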
95+
96+
## Using `keys`
97+
98+
BoilingData works best for running the same query against many files (for example, creating a historical trend from a dataset that is partitioned by date). To achieve this, you can use the `keys` array to specify a list of files to query, and the string `s3://KEY` in place of the file location in the SQL query:
99+
100+
```typescript
101+
bdInstance.execQuery({
102+
sql: `SELECT 's3://KEY' as fileLocation, COUNT(*) as rowCount FROM parquet_scan('s3://KEY');`,
103+
keys: [
104+
"s3://bucket/data/2022-01-01.parquet",
105+
"s3://bucket/data/2022-01-02.parquet",
106+
"s3://bucket/data/2022-01-03.parquet",
107+
]});
108+
```
109+
Results are streamed as soon as they are ready, so you are unlikely to receive them in the same order in which you specified the files.
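
If you need the results in the original order, one option is to reorder them on the client using the `fileLocation` column selected in the query above. This is a sketch under assumptions: the `FileCount` row shape (including `rowCount` arriving as a `number`), the `orderByKeys` helper, and the sample counts are hypothetical.

```typescript
// Assumed row shape, matching the column aliases in the SQL above.
interface FileCount {
  fileLocation: string;
  rowCount: number;
}

// Return the rows in the same order as `keys`, skipping keys with no result yet.
function orderByKeys(keys: string[], rows: FileCount[]): FileCount[] {
  const byLocation = new Map<string, FileCount>();
  for (const row of rows) {
    byLocation.set(row.fileLocation, row);
  }
  return keys
    .map((key) => byLocation.get(key))
    .filter((row): row is FileCount => row !== undefined);
}

const requestedKeys = [
  "s3://bucket/data/2022-01-01.parquet",
  "s3://bucket/data/2022-01-02.parquet",
  "s3://bucket/data/2022-01-03.parquet",
];

// Rows in arrival order, which differs from the requested order (made-up counts).
const arrived: FileCount[] = [
  { fileLocation: "s3://bucket/data/2022-01-03.parquet", rowCount: 7 },
  { fileLocation: "s3://bucket/data/2022-01-01.parquet", rowCount: 5 },
  { fileLocation: "s3://bucket/data/2022-01-02.parquet", rowCount: 6 },
];

const ordered = orderByKeys(requestedKeys, arrived);
console.log(ordered.map((row) => row.rowCount));
```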
110+
111+
If you do not need to query multiple files, then you do not need to specify `keys`; put the file location directly in the SQL instead, for instance `SELECT COUNT(*) as rowCount FROM parquet_scan('s3://bucket/data/2022-01-01.parquet');`.
