This library provides tools for evaluating and tracing AI applications in Braintrust. Use it to:
- Evaluate your AI models with custom test cases and scoring functions
- Trace LLM calls and monitor AI application performance with OpenTelemetry
- Integrate seamlessly with OpenAI, Anthropic, and other LLM providers
This SDK is currently in BETA status and APIs may change.
The fastest way to report data to Braintrust is to add the braintrust java agent to your jvm startup args.
springboot+gradle example:
configurations {
btAgent
}
dependencies {
implementation 'org.springframework.boot:spring-boot-starter-web'
btAgent "dev.braintrust:braintrust-java-agent:<version-goes-here>"
}
bootRun {
jvmArgs = [
// NOTE: if you're running with other java agents, add the braintrust agent last
"-javaagent:${configurations.btAgent.singleFile.absolutePath}",
]
}This will automatically instrument major AI clients and frameworks. No code changes required. A list of supported instrumentation can be found here
NOTE: Additional steps may be required for users running with other -javaagent flags. Consult braintrust docs for details.
Add the Braintrust SDK to your package manager.
gradle example:
dependencies {
implementation 'dev.braintrust:braintrust-sdk-java:<version-goes-here>'
}Use the SDK to create and send your eval:
var braintrust = Braintrust.get();
braintrust.openTelemetryCreate();
var openAIClient = OpenAIOkHttpClient.fromEnv();
Function<String, String> getFoodType =
(String food) -> {
var request =
ChatCompletionCreateParams.builder()
.model(ChatModel.GPT_4O_MINI)
.addSystemMessage("Return a one word answer")
.addUserMessage("What kind of food is " + food + "?")
.build();
var response = openAIClient.chat().completions().create(request);
return response.choices().get(0).message().content().orElse("").toLowerCase();
};
var eval = braintrust.<String, String>evalBuilder()
.name("java-eval-x-" + System.currentTimeMillis())
.cases(
DatasetCase.of("asparagus", "vegetable"),
DatasetCase.of("banana", "fruit"))
.taskFunction(getFoodType)
.scorers(
Scorer.of(
"exact_match",
(expected, result) -> expected.equals(result) ? 1.0 : 0.0))
.build();
var result = eval.run();
System.out.println("\n\n" + result.createReportString());Alternatively, sdk users can manually apply instrumentation instead of using the java agent.
var braintrust = Braintrust.get();
var openTelemetry = braintrust.openTelemetryCreate();
OpenAIClient openAIClient = BraintrustOpenAI.wrapOpenAI(openTelemetry, OpenAIOkHttpClient.fromEnv());
var request =
ChatCompletionCreateParams.builder()
.model(ChatModel.GPT_4O_MINI)
.addUserMessage("What is the capital of France?")
.build();
// openai calls will be traced and reported to braintrust
var response = openAIClient.chat().completions().create(request);A list of supported instrumentation can be found here
Example source code can be found here
export BRAINTRUST_API_KEY="your-braintrust-api-key"
export OPENAI_API_KEY="your-oai-api-key" # to run oai examples
export ANTHROPIC_API_KEY="your-anthropic-api-key" # to run anthropic examples
# install java 17 or later
brew install openjdk@17 # macOS
sudo apt install openjdk-17-jdk # ubuntu
# to run a specific example
./gradlew :examples:runSimpleOpenTelemetry
# to see all examples
./gradlew :examples:tasks --group="Braintrust SDK Examples"The SDK uses a standard slf4j logger and will use the default log level (or not log at all if slf4j is not installed).
All Braintrust loggers will log into the dev.braintrust namespace. To adjust the log level, consult your logger documentation.
For example, to enable debug logging for slf4j-simple you would set the system property org.slf4j.simpleLogger.log.dev.braintrust=DEBUG