[bug] Silent failure when uploading PDF with Type1 fonts missing /ToUnicode map — no exception raised, gemini-3.x only #2482

### Is this a client library issue or a product issue?

This is **both**, but the client library has an actionable gap: the SDK raises no exception, warning, or structured signal when the model silently fails to process an uploaded PDF. The call returns `200 OK` with a natural-language response asking the user to paste the document manually — indistinguishable from a successful response in automated pipelines. The underlying model-level regression is separately reported on the [Google AI Dev Forum](https://discuss.ai.google.dev/c/ai-studio/8).

---

#### Environment details

- **Programming language:** Python
- **OS:** Linux / macOS (reproduced on both)
- **Language runtime version:** Python 3.11+
- **Package version:** `google-generativeai` latest

---

#### Steps to reproduce

1. Take a PDF file whose Type1 fonts use a custom `/Encoding` with a `/Differences` array but have **no `/ToUnicode` map** (e.g. any KID document generated by Neevia docCreator v4.5; full file analysis is in the Dev Forum post linked above).

2. Upload the file and call `generate_content()` targeting any Gemini 3.x model:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# Upload the affected PDF through the File API.
uploaded_file = genai.upload_file(
    path="LU0089290844_KID.pdf",
    mime_type="application/pdf",
)

model = genai.GenerativeModel(model_name="gemini-3.5-flash")
# also reproduced with: gemini-3.1-pro, gemini-3.1-flash-lite

# Send the uploaded file and the prompt in a single request.
response = model.generate_content([
    uploaded_file,
    "Extract all data from this PDF document."
])

# Prints a "please paste the document" message instead of extracted data.
print(response.text)
```

3. Observe that:
   - No exception is raised.
   - `response.text` contains a message such as *"It seems the text of the document was not included — please paste it directly."*
   - There is no structured field in the response to detect the failure programmatically.

4. Switch `model_name` to `"gemini-2.5-flash"` with identical code and the same file → correct extracted content is returned.

---

#### Expected behavior

- The model extracts the PDF content correctly (as `gemini-2.5-flash` does), **or**
- The SDK raises a warning / structured error signal when the uploaded file is not processed, so callers can detect and handle the failure in automated pipelines.

#### Actual behavior

The call succeeds with HTTP 200. The model silently ignores the PDF content and returns a natural-language fallback response. No exception, no warning, no detectable signal.
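
Until a structured signal exists, a caller can only guess from the reply text. The following is a minimal sketch of such a guard, not part of the SDK: the exception class, the function name, and the phrase list are all hypothetical, and only the first pattern comes from the message quoted in this report. A heuristic like this will miss differently worded replies and may flag legitimate ones.

```python
import re


class DocumentNotProcessedError(RuntimeError):
    """Hypothetical error: the reply suggests the uploaded file was ignored."""


# Assumed phrases; only the first is taken from the reply quoted in this issue.
_FALLBACK_PATTERNS = [
    re.compile(r"text of the document was not included", re.IGNORECASE),
    re.compile(r"please paste (it|the (text|document|content))", re.IGNORECASE),
]


def ensure_document_was_read(reply_text: str) -> str:
    """Return reply_text unchanged, or raise if it looks like a fallback reply."""
    for pattern in _FALLBACK_PATTERNS:
        if pattern.search(reply_text):
            raise DocumentNotProcessedError(
                f"Reply matches fallback pattern {pattern.pattern!r}"
            )
    return reply_text
```

In the reproduction script, this would wrap the last line as `print(ensure_document_was_read(response.text))`, turning the silent failure into an exception the pipeline can handle.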

---

#### Additional context

I analysed 4 failing files and 2 working references; the table lists the 4 failing files and one of the working references. The pattern is fully reproducible:

| File | Producer | Fonts missing `/ToUnicode` | Result |
|------|----------|----------------------------|--------|
| `LU0089290844_KID.pdf` | Neevia docCreator v4.5 | All Type1 fonts | ❌ Empty |
| `LU2533812058_KID.pdf` | Neevia docCreator v4.5 | All Type1 fonts | ❌ Empty |
| `LU2314312922_KID.pdf` | Neevia docCreator v4.5 | All Type1 fonts | ❌ Empty |
| `LU2526007799_KID.pdf` | Neevia docCreator v5.0 | `/R39` (page 2 only) | ⚠️ Partial (page 2 corrupted) |
| `PRIIP_KID_F0GBR04BQM_299.pdf` | Neevia docCreator v5.0 | None | ✅ OK |

All analysed files produced by Neevia docCreator v4.5 omit `/ToUnicode` on Type1 fonts with a custom encoding. Without `/ToUnicode`, an extractor that relies only on that map cannot translate character codes to Unicode and reads the document as empty. ISO 32000-1 (section 9.10.2) describes a fallback for simple fonts: resolve each character code to a glyph name through the font's encoding and `/Differences` array, then look the name up in the Adobe Glyph List. `gemini-2.5-flash` and libraries like `pypdf` appear to handle these files this way; Gemini 3.x does not seem to apply the fallback.
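
To make the fallback concrete, here is a minimal sketch of the glyph-name route for a simple font. It is illustrative only: `AGL_EXCERPT` is a tiny hand-picked subset of the Adobe Glyph List, the function names are hypothetical, and the sketch ignores the base encoding that a real extractor would consult for codes not listed in `/Differences`.

```python
# Tiny excerpt of the Adobe Glyph List (glyph name -> Unicode text).
AGL_EXCERPT = {
    "K": "K",
    "I": "I",
    "D": "D",
    "space": " ",
    "Euro": "\u20ac",
    "eacute": "\u00e9",
}


def parse_differences(differences):
    """Turn a /Differences array into a {character code: glyph name} map.

    An integer sets the current code; each following name is assigned to
    consecutive codes starting there.
    """
    code_to_name = {}
    code = 0
    for item in differences:
        if isinstance(item, int):
            code = item
        else:
            code_to_name[code] = item.lstrip("/")
            code += 1
    return code_to_name


def decode(data, code_to_name, glyph_list=AGL_EXCERPT):
    """Decode single-byte character codes via glyph names; unknowns become U+FFFD."""
    return "".join(
        glyph_list.get(code_to_name.get(byte, ""), "\ufffd") for byte in data
    )
```

For example, with `[1, "/K", "/I", "/D", 32, "/space", 128, "/Euro"]` as the `/Differences` array, the bytes `01 02 03 20 80` decode to `KID €` even though the font carries no `/ToUnicode` map.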

**SDK-level ask:** even if the model fix must happen on the product side, the library could optionally add a pre-flight check warning callers when a PDF's fonts lack `/ToUnicode` maps, preventing silent failures in production pipelines.
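
A rough sketch of what such a pre-flight check could look like on the caller side, assuming `pypdf` is installed. The function names are hypothetical and this is not an existing SDK feature. The sketch only inspects fonts declared directly in each page's `/Resources`; inherited resources and fonts inside form XObjects are not covered.

```python
from typing import Any, Dict, List, Mapping


def fonts_missing_tounicode(fonts: Mapping[str, Mapping[str, Any]]) -> List[str]:
    """Return resource names of simple fonts that have no /ToUnicode entry."""
    missing = []
    for name, font in fonts.items():
        # Only simple fonts are checked; Type0 (composite) fonts are skipped.
        is_simple = font.get("/Subtype") in ("/Type1", "/TrueType")
        if is_simple and "/ToUnicode" not in font:
            missing.append(name)
    return missing


def scan_pdf(path: str) -> Dict[int, List[str]]:
    """Map 1-based page numbers to font names lacking /ToUnicode."""
    from pypdf import PdfReader  # third-party dependency, imported lazily

    report = {}
    for number, page in enumerate(PdfReader(path).pages, start=1):
        resources = page.get("/Resources")
        if resources is None:
            continue
        fonts = resources.get_object().get("/Font")
        if fonts is None:
            continue
        # Resolve indirect references so each font is a plain dictionary.
        resolved = {
            name: ref.get_object() for name, ref in fonts.get_object().items()
        }
        missing = fonts_missing_tounicode(resolved)
        if missing:
            report[number] = missing
    return report
```

A pipeline could call `scan_pdf("LU0089290844_KID.pdf")` before uploading and log a warning, or route the file to a local extractor, whenever the returned report is non-empty.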

Happy to share the PDF files or further details if helpful.
