 # Hyperparameters passed to the model at inference-time
 hyperparameters:
   max_new_tokens: 2000
-  temperature: 0.7
+  temperature: 0.6
   top_p: 0.1
   top_k: 40
   typical_p: 1
   min_p: 0
-  repetition_penalty: 1.2
+  repetition_penalty: 0.85
 # The prompt given to the model for this task. {json_input} is automatically replaced by a stringified JSON array
 # containing an entry for each segment extracted from the video. {json_schema} is automatically replaced by a
 # JSON schema definition of the expected output the LLM should give.
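The placeholder substitution described in the comments above could look roughly like the following. This is a minimal sketch, not the project's actual code: `PROMPT_TEMPLATE`, `fill_prompt`, and the sample data are hypothetical names introduced for illustration, and the use of `str.replace` is an assumption.

```python
import json

# Hypothetical template; the real prompt text lives in the YAML config.
PROMPT_TEMPLATE = (
    "This is the input JSON:\n{json_input}\n"
    "Your answer should adhere to the following JSON schema:\n{json_schema}"
)


def fill_prompt(template: str, segments: list[dict], schema: dict) -> str:
    """Substitute both placeholders with stringified JSON (illustrative only)."""
    # str.replace is used instead of str.format so that literal braces in the
    # template or in the inserted JSON are left untouched.
    # The schema goes in first, so placeholder-like text inside the
    # transcript data is never substituted afterwards.
    filled = template.replace("{json_schema}", json.dumps(schema))
    return filled.replace("{json_input}", json.dumps(segments, ensure_ascii=False))


# Made-up example segment with the three keys named in the prompt.
segments = [
    {"start_time": 0, "transcript": "Welcome to the lecture.", "screen_text": "Intro"},
]
# Made-up schema: an object whose values are all strings.
schema = {"type": "object", "additionalProperties": {"type": "string"}}
prompt = fill_prompt(PROMPT_TEMPLATE, segments, schema)
```

Plain replacement is chosen here because both the prompt and the JSON payloads contain curly braces, which `str.format` would try to interpret as fields.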
@@ -71,14 +71,8 @@ lecture_llm_generator:
 
 You are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>
 
-I have an input JSON file I need to process. It contains an array, where each element is a snippet of a lecture video. Each element contains the keys "start_time", which denotes the start time of the snippet in seconds after video start, a "transcript" of the spoken text, and "screen_text", the text on screen as detected by OCR. The transcript and screen_text might contain inaccuracies due to the nature of STT and OCR. The video was split into snippets by detecting when the screen changes by a significant amount. Please create a JSON file containing just an array of strings, the list of strings should consist of 1 to 5 bullet points to summarize the contents of the video. Each bullet point should at most be 2 sentences long. Remember to answer only with a JSON file.
-
-Your answer should adhere to the following JSON schema:
-```
-{json_schema}
-```
+I have an input JSON file I need to process. It contains an array, where each element is a snippet of a lecture video. Each element contains the keys "start_time", which denotes the start time of the snippet in seconds after video start, a "transcript" of the spoken text, and "screen_text", the text on screen as detected by OCR. The transcript and screen_text might contain inaccuracies due to the nature of STT and OCR. The video was split into snippets by detecting when the screen changes by a significant amount. Please create a JSON object which contains a property for each segment in the input, where the property key is the start_time of the segment, and the value is a string containing your suggested title for that segment. Choose high-quality and concise titles. If you want two back-to-back snippet to be considered as the same chapter, give them the same title in your JSON array. Remember to answer only with a JSON file. This is the input JSON:
I have an input JSON file I need to process. It contains an array, where each element is a snippet of a lecture video. Each element contains the keys "start_time", which denotes the start time of the snippet in seconds after video start, a "transcript" of the spoken text, and "screen_text", the text on screen as detected by OCR. The transcript and screen_text might contain inaccuracies due to the nature of STT and OCR. The video was split into snippets by detecting when the screen changes by a significant amount. Please create a JSON file containing an array of elements, where each element represents the respective snippet from the input JSON. Each element should contain a title you'd give this snippet. Choose high-quality and concise titles. If you want two back-to-back snippet to be considered as the same chapter, give them the same title in your JSON array. Remember to answer only with a JSON file. This is the input JSON:
+answer_schema=answer_model.model_json_schema()
 
-Your response should be following this JSON schema: {json_schema}
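The revised chapter prompt asks for a JSON object keyed by `start_time`, with back-to-back segments sharing a title when they belong to the same chapter. A consumer of that answer might collapse it into chapters along these lines. This is a sketch under assumptions: `titles_to_chapters` and the sample answer are hypothetical, and the repository's real post-processing is not shown in this diff.

```python
import json


def titles_to_chapters(raw_answer: str) -> list[dict]:
    """Collapse a {start_time: title} JSON object into a list of chapters."""
    titles = json.loads(raw_answer)
    # JSON object keys are always strings, so convert them back to numbers
    # before sorting the segments by time.
    ordered = sorted((float(start), title) for start, title in titles.items())
    chapters = []
    for start, title in ordered:
        # A segment that repeats the previous title continues that chapter.
        if chapters and chapters[-1]["title"] == title:
            continue
        chapters.append({"start_time": start, "title": title})
    return chapters


# Made-up model answer in the shape the new prompt requests.
raw = '{"0": "Introduction", "42.5": "Introduction", "90": "Gradient Descent"}'
chapters = titles_to_chapters(raw)
```

Sorting by the numeric key matters because nothing guarantees the model emits the properties in chronological order.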