Commit 155aea7

moving and fixing kotlin-spark example to /projects
1 parent b04aaf4 commit 155aea7

33 files changed

Lines changed: 1006 additions & 21 deletions

docs/StardustDocs/topics/concepts/concepts.md

Lines changed: 1 addition & 1 deletion
@@ -44,7 +44,7 @@ This is why it was designed to be hierarchical and allows nesting of columns and
 * [**Interoperable**](collectionsInterop.md) — convertable with Kotlin data classes and collections.
 This also means conversion to/from other libraries' data structures is usually quite straightforward!
 See our [examples](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources)
-for some conversions between DataFrame and [Apache Spark](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/spark), [Multik](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/multik), and [JetBrains Exposed](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/exposed).
+for some conversions between DataFrame and [Apache Spark](https://github.com/Kotlin/dataframe/tree/master/examples/projects/spark-parquet-dataframe), [Multik](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/multik), and [JetBrains Exposed](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/exposed).
 * **Generic** — can store objects of any type, not only numbers or strings.
 * **Typesafe** — the Kotlin DataFrame library provides a mechanism of on-the-fly [**generation of extension properties**](extensionPropertiesApi.md)
 that correspond to the columns of a dataframe.

docs/StardustDocs/topics/dataSources/Integrations.md

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ Below is a list of example integrations with other data frameworks.
 These examples demonstrate how to bridge Kotlin DataFrame with external libraries or APIs.
 
 - [Kotlin Exposed](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/exposed)
-- [Apache Spark (with/without Kotlin Spark API)](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/spark)
+- [Apache Spark (with/without Kotlin Spark API)](https://github.com/Kotlin/dataframe/tree/master/examples/projects/spark-parquet-dataframe)
 - [Multik](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/multik)
 
 You can use these examples as templates to create your own integrations

docs/StardustDocs/topics/guides/Guides-And-Examples.md

Lines changed: 1 addition & 1 deletion
@@ -58,7 +58,7 @@ and make working with your data both convenient and type-safe.
 * [Using Unsupported Data Sources](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/src/main/kotlin/org/jetbrains/kotlinx/dataframe/examples):
 — A guide by examples. While these might one day become proper integrations of DataFrame, for now,
 we provide them as examples for how to make such integrations yourself.
-* [Apache Spark Interop (With and Without Kotlin Spark API)](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/spark)
+* [Apache Spark Interop (With and Without Kotlin Spark API)](https://github.com/Kotlin/dataframe/tree/master/examples/projects/spark-parquet-dataframe)
 * [Multik Interop](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/multik)
 * [JetBrains Exposed Interop](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/exposed)
 * [Hibernate ORM](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/hibernate)

examples/README.md

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ They show how to convert to and from Kotlin DataFrame and their respective table
 for an example of using Kotlin DataFrame with [Exposed](https://github.com/JetBrains/Exposed).
 * **Hibernate**: See the [hibernate folder](./idea-examples/unsupported-data-sources/hibernate)
 for an example of using Kotlin DataFrame with [Hibernate](https://hibernate.org/orm/).
-* **Apache Spark**: See the [spark folder](./idea-examples/unsupported-data-sources/spark)
+* **Apache Spark**: See the [spark folder](./projects/kotlin-spark)
 for an example of using Kotlin DataFrame with [Spark](https://spark.apache.org/) and with the [Kotlin Spark API](https://github.com/JetBrains/kotlin-spark-api).
 * **Multik**: See the [multik folder](./idea-examples/unsupported-data-sources/multik)
 for an example of using Kotlin DataFrame with [Multik](https://github.com/Kotlin/multik).
Lines changed: 41 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,41 @@
1+
root = true
2+
3+
[*]
4+
charset = utf-8
5+
end_of_line = lf
6+
insert_final_newline = true
7+
indent_style = space
8+
indent_size = 4
9+
max_line_length = 120
10+
11+
[*.json]
12+
indent_size = 2
13+
14+
[{*.yaml,*.yml}]
15+
indent_size = 2
16+
17+
[*.ipynb]
18+
insert_final_newline = false
19+
20+
[*.{kt,kts}]
21+
ij_kotlin_code_style_defaults = KOTLIN_OFFICIAL
22+
23+
# Disable wildcard imports entirely
24+
ij_kotlin_name_count_to_use_star_import = 2147483647
25+
ij_kotlin_name_count_to_use_star_import_for_members = 2147483647
26+
ij_kotlin_packages_to_use_import_on_demand = unset
27+
28+
ktlint_code_style = ktlint_official
29+
ktlint_experimental = enabled
30+
ktlint_standard_filename = disabled
31+
ktlint_standard_no-empty-first-line-in-class-body = disabled
32+
ktlint_class_signature_rule_force_multiline_when_parameter_count_greater_or_equal_than = 4
33+
ktlint_function_signature_rule_force_multiline_when_parameter_count_greater_or_equal_than = 4
34+
ktlint_standard_chain-method-continuation = disabled
35+
ktlint_ignore_back_ticked_identifier = true
36+
ktlint_standard_multiline-expression-wrapping = disabled
37+
ktlint_standard_when-entry-bracing = disabled
38+
ktlint_standard_expression-operand-wrapping = disabled
39+
40+
[{*/build/**/*,**/*keywords*/**,**/*.Generated.kt,**/*$Extensions.kt,**/BuildConfig.kt}]
41+
ktlint = disabled
Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,18 @@
1+
# Apache Spark
2+
3+
Showcase of how to use DataFrame [Apache Spark](https://spark.apache.org/) and
4+
the [Kotlin Spark API](https://github.com/JetBrains/kotlin-spark-api).
5+
6+
Even though Spark is not officially supported as a data source in DataFrame,
7+
this project shows how to convert from and to Spark tables.
8+
9+
This project uses the
10+
[Kotlin DataFrame Compiler Plugin](https://kotlin.github.io/dataframe/compiler-plugin.html).
11+
12+
We recommend using an up-to-date IntelliJ IDEA for the best experience,
13+
as well as the latest Kotlin plugin version.
14+
15+
> [!WARNING]
16+
> For proper functionality in IntelliJ IDEA requires version 2025.2 or newer.
17+
18+
[Download this Example](https://github.com/Kotlin/dataframe/raw/example-projects-archives/kotlin-spark.zip)

examples/idea-examples/unsupported-data-sources/spark/build.gradle.kts renamed to examples/projects/dev/kotlin-spark/build.gradle.kts

Lines changed: 10 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -1,28 +1,25 @@
11
import org.jetbrains.kotlin.gradle.dsl.JvmTarget
22

33
plugins {
4-
application
5-
kotlin("jvm")
6-
7-
// uses the 'old' Gradle plugin instead of the compiler plugin for now
8-
id("org.jetbrains.kotlinx.dataframe")
4+
alias(libs.plugins.kotlin.jvm)
5+
alias(libs.plugins.kotlin.dataframe)
6+
alias(libs.plugins.ktlint.gradle)
97

10-
// only mandatory if `kotlin.dataframe.add.ksp=false` in gradle.properties
11-
id("com.google.devtools.ksp")
8+
application
129
}
1310

1411
repositories {
15-
mavenLocal() // in case of local dataframe development
1612
mavenCentral()
1713
}
1814

1915
dependencies {
20-
// implementation("org.jetbrains.kotlinx:dataframe:X.Y.Z")
21-
implementation(project(":"))
16+
implementation(libs.dataframe)
2217

23-
// (kotlin) spark support
18+
// (Kotlin) Spark SQL (Spark 3.3.2)
2419
implementation(libs.kotlin.spark)
25-
compileOnly(libs.spark)
20+
compileOnly(libs.spark.sql)
21+
22+
// Logging to keep Spark quiet
2623
implementation(libs.log4j.core)
2724
implementation(libs.log4j.api)
2825
}
@@ -64,6 +61,7 @@ val runSparkUntypedDataset by tasks.registering(JavaExec::class) {
6461
}
6562

6663
kotlin {
64+
jvmToolchain(11)
6765
compilerOptions {
6866
jvmTarget = JvmTarget.JVM_11
6967
freeCompilerArgs.add("-Xjdk-release=11")
Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
org.gradle.jvmargs=-Xmx1g -Dfile.encoding=UTF-8
2+
kotlin.code.style=official
3+
# Disabling incremental compilation will no longer be necessary
4+
# when https://youtrack.jetbrains.com/issue/KT-66735 is resolved.
5+
kotlin.incremental=false
Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,24 @@
1+
[versions]
2+
kotlin = "2.3.21"
3+
dataframe = "1.0.0-Beta5"
4+
ktlint-gradle = "14.0.1"
5+
ktlint = "1.8.0"
6+
log4j = "2.25.4"
7+
8+
# check the versions down in the [libraries] section too!
9+
kotlin-spark = "1.2.4"
10+
spark3 = "3.3.2"
11+
12+
[libraries]
13+
dataframe = { module = "org.jetbrains.kotlinx:dataframe", version.ref = "dataframe" }
14+
log4j-core = { group = "org.apache.logging.log4j", name = "log4j-core", version.ref = "log4j" }
15+
log4j-api = { group = "org.apache.logging.log4j", name = "log4j-api", version.ref = "log4j" }
16+
kotlin-spark = { group = "org.jetbrains.kotlinx.spark", name = "kotlin-spark-api_3.3.2_2.13", version.ref = "kotlin-spark" }
17+
spark-sql = { group = "org.apache.spark", name = "spark-sql_2.13", version.ref = "spark3" }
18+
19+
[plugins]
20+
kotlin-jvm = { id = "org.jetbrains.kotlin.jvm", version.ref = "kotlin" }
21+
ktlint-gradle = { id = "org.jlleitschuh.gradle.ktlint", version.ref = "ktlint-gradle" }
22+
23+
# The Kotlin DataFrame Compiler plugin is the same version as the Kotlin plugin.
24+
kotlin-dataframe = { id = "org.jetbrains.kotlin.plugin.dataframe", version.ref = "kotlin" }
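
Reviewer note: to make the catalog aliases easier to check against the build script, here is a sketch of what the `plugins` and `dependencies` blocks in `build.gradle.kts` would look like with the catalog entries written out inline. It assumes the standard Gradle alias mapping, where dashes in catalog keys become dots in the `libs.` accessors. All coordinates and versions are copied from the catalog above; the rest of the build script (tasks, `kotlin { ... }` block) is not shown.

```kotlin
// Sketch only: inline equivalent of the version-catalog aliases used in build.gradle.kts.
// Each trailing comment names the catalog accessor the line stands in for.
plugins {
    id("org.jetbrains.kotlin.jvm") version "2.3.21" // libs.plugins.kotlin.jvm
    id("org.jetbrains.kotlin.plugin.dataframe") version "2.3.21" // libs.plugins.kotlin.dataframe
    id("org.jlleitschuh.gradle.ktlint") version "14.0.1" // libs.plugins.ktlint.gradle

    application
}

repositories {
    mavenCentral()
}

dependencies {
    implementation("org.jetbrains.kotlinx:dataframe:1.0.0-Beta5") // libs.dataframe

    // (Kotlin) Spark SQL (Spark 3.3.2)
    implementation("org.jetbrains.kotlinx.spark:kotlin-spark-api_3.3.2_2.13:1.2.4") // libs.kotlin.spark
    compileOnly("org.apache.spark:spark-sql_2.13:3.3.2") // libs.spark.sql

    // Logging to keep Spark quiet
    implementation("org.apache.logging.log4j:log4j-core:2.25.4") // libs.log4j.core
    implementation("org.apache.logging.log4j:log4j-api:2.25.4") // libs.log4j.api
}
```

The `ktlint` version entry has no library or plugin alias in this catalog, so it does not appear in the sketch. Keeping the Kotlin and DataFrame compiler plugin on the same `version.ref` is what the catalog comment asks for; the inline form would need both version strings updated by hand.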
Binary file not shown.
