Commit 628ce5d

Add excel shorthand for Excel-friendly UTF-8 exports.
Microsoft Excel on Windows does not recognise a UTF-8 CSV unless it has a byte-order mark, CRLF line endings, and an explicit UTF-8 declaration. Users have had to remember and set all three options individually each time. This is a recurring source of "opens as mojibake" reports.

Add a single `excel` config key (default false). When enabled, the view forces `bom => true`, `eol => "\r\n"`, and `csvEncoding => 'UTF-8'` at serialize time. The preset wins for those three keys; other CSV options (delimiter, enclosure, header, extract, setSeparator, etc.) are independent and behave normally.

The preset runs in `_serialize()` rather than `initialize()` so it takes effect regardless of when `excel` is set, including the common test pattern of constructing the view and then calling `setConfig()`.

README documents the new option under a new "Excel-friendly UTF-8 export" heading in the Usage section.
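
To make the three forced options concrete, here is a minimal standalone sketch of the bytes such an export would contain. It does not use the plugin: `csvLine()` and `excelFriendlyCsv()` are hypothetical helpers written only for illustration, and the sketch assumes the input rows are already UTF-8, so no transcoding step is shown.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of CsvView): encode one row with fputcsv()
 * and return it without the trailing "\n" that fputcsv() appends.
 */
function csvLine(array $row): string
{
    $stream = fopen('php://memory', 'w+');
    fputcsv($stream, $row, ',', '"', '\\');
    rewind($stream);
    $line = stream_get_contents($stream);
    fclose($stream);

    // Drop only the final newline; newlines inside quoted fields are kept.
    return substr($line, 0, -1);
}

/**
 * Hypothetical helper: UTF-8 BOM first, then every row terminated by CRLF.
 * Assumes all cell values are already UTF-8 strings.
 */
function excelFriendlyCsv(array $rows): string
{
    $out = "\xEF\xBB\xBF"; // UTF-8 byte-order mark
    foreach ($rows as $row) {
        $out .= csvLine($row) . "\r\n";
    }

    return $out;
}

$csv = excelFriendlyCsv([['Möhre', 'café'], ['ü', 'ß']]);
echo bin2hex(substr($csv, 0, 3)), PHP_EOL; // prints "efbbbf"
```

The string built here should match the `$expected` value used in the commit's first test, which is one way to read what the preset is meant to guarantee.
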
1 parent 9cfd520 commit 628ce5d

3 files changed

Lines changed: 103 additions & 0 deletions

README.md

Lines changed: 30 additions & 0 deletions
@@ -263,6 +263,36 @@ The currently supported encoding extensions are as follows:
 - `iconv`
 - `mbstring`
 
+#### Excel-friendly UTF-8 export
+
+Microsoft Excel on Windows does not recognise a UTF-8 CSV unless it has a
+byte-order mark, CRLF line endings, and an explicit UTF-8 declaration. Setting
+all three options individually each time is repetitive and easy to get wrong.
+
+The `excel` shorthand sets the right defaults in one go:
+
+```php
+$this->viewBuilder()
+    ->setClassName('CsvView.Csv')
+    ->setOptions([
+        'serialize' => 'data',
+        'excel' => true,
+    ]);
+```
+
+`excel => true` is equivalent to:
+
+```php
+'bom' => true,
+'eol' => "\r\n",
+'csvEncoding' => 'UTF-8',
+```
+
+The shorthand always wins for the three keys it controls; if you need a
+different combination (e.g. UTF-16, no BOM) do not enable `excel` and set the
+individual keys yourself instead. Other CSV options (`delimiter`, `enclosure`,
+`setSeparator`, `header`, `extract`, etc.) are independent and behave normally.
+
 #### Setting the downloaded file name
 
 By default, the downloaded file will be named after the last segment of the URL

src/View/CsvView.php

Lines changed: 29 additions & 0 deletions
@@ -144,6 +144,10 @@ class CsvView extends SerializedView
      * - 'csvEncoding': (default 'UTF-8') CSV file encoding
      * - 'dataEncoding': (default 'UTF-8') Encoding of data to be serialized
      * - 'transcodingExtension': (default 'iconv') PHP extension to use for character encoding conversion
+     * - 'excel': (default false) Shorthand for an Excel-friendly UTF-8 export.
+     * When true, sets `bom => true`, `eol => "\r\n"`, and `csvEncoding => 'UTF-8'`.
+     * These specific keys are forced; if you need a different combination
+     * do not enable `excel` and set them individually instead.
      *
      * @var array<string, mixed>
      */
@@ -163,6 +167,7 @@ class CsvView extends SerializedView
         'csvEncoding' => 'UTF-8',
         'dataEncoding' => 'UTF-8',
         'transcodingExtension' => self::EXTENSION_ICONV,
+        'excel' => false,
     ];
 
     /**
@@ -210,6 +215,7 @@ public static function contentType(): string
     protected function _serialize(array|string $serialize): string
     {
         $this->resetState();
+        $this->_applyExcelPreset();
 
         $this->_renderRow($this->getConfig('header'));
         $this->_renderContent();
@@ -246,6 +252,29 @@ public function __destruct()
         }
     }
 
+    /**
+     * Apply the `excel` shorthand if enabled: BOM + CRLF EOL + UTF-8 encoding,
+     * the three options Excel needs to open a UTF-8 CSV correctly on Windows.
+     *
+     * Applied at serialize-time (rather than `initialize()`) so the preset
+     * takes effect regardless of when `excel` is set, including the test
+     * pattern of constructing the view and then calling `setConfig()`.
+     *
+     * @return void
+     */
+    protected function _applyExcelPreset(): void
+    {
+        if (!$this->getConfig('excel')) {
+            return;
+        }
+
+        $this->setConfig([
+            'bom' => true,
+            'eol' => "\r\n",
+            'csvEncoding' => 'UTF-8',
+        ]);
+    }
+
     /**
      * Renders the body of the data to the csv
      *
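
The timing argument in the docblock above (resolve the preset at serialize time, not at initialization) can be illustrated with a small stand-in class. This is only a sketch: `PresetTimingDemo`, its `$applyAtInit` flag, and `effectiveConfig()` are hypothetical names invented here and are not the plugin's API. It models the one case the commit calls out, where `excel` is set through `setConfig()` after the object is constructed.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for a configurable view, NOT the plugin's CsvView.
 * $applyAtInit chooses when the preset is resolved.
 */
final class PresetTimingDemo
{
    /** @var array<string, mixed> */
    private array $config = ['excel' => false, 'bom' => false, 'eol' => "\n"];

    private bool $applyAtInit;

    public function __construct(bool $applyAtInit)
    {
        $this->applyAtInit = $applyAtInit;
        if ($applyAtInit) {
            $this->applyPreset(); // `excel` is still false here, so nothing happens
        }
    }

    public function setConfig(array $config): self
    {
        $this->config = $config + $this->config; // newly passed values win

        return $this;
    }

    /** Returns the options that would be in effect when rendering. */
    public function effectiveConfig(): array
    {
        if (!$this->applyAtInit) {
            $this->applyPreset(); // resolved late, after any setConfig() calls
        }

        return $this->config;
    }

    private function applyPreset(): void
    {
        if (!$this->config['excel']) {
            return;
        }
        $this->config = ['bom' => true, 'eol' => "\r\n"] + $this->config;
    }
}

$early = (new PresetTimingDemo(true))->setConfig(['excel' => true])->effectiveConfig();
$late = (new PresetTimingDemo(false))->setConfig(['excel' => true])->effectiveConfig();

var_dump($early['bom']); // bool(false): the preset ran before `excel` was set
var_dump($late['bom']);  // bool(true): the preset ran at "serialize" time
```

In this model the early variant silently ignores the flag, which is the failure mode the serialize-time hook is meant to avoid.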

tests/TestCase/View/CsvViewTest.php

Lines changed: 44 additions & 0 deletions
@@ -597,4 +597,48 @@ public function testRenderViaExtractArrayValueThrows()
             );
         }
     }
+
+    /**
+     * `excel => true` is a shorthand that forces the three options Excel
+     * needs to open a UTF-8 CSV correctly on Windows: BOM, CRLF line
+     * endings, and UTF-8 encoding.
+     *
+     * @return void
+     */
+    public function testExcelPresetEmitsBomCrlfAndUtf8()
+    {
+        $data = [['Möhre', 'café'], ['ü', 'ß']];
+        $this->view->set(['data' => $data])
+            ->setConfig(['serialize' => 'data', 'excel' => true]);
+
+        $bom = chr(0xEF) . chr(0xBB) . chr(0xBF);
+        $expected = $bom . 'Möhre,café' . "\r\n" . 'ü,ß' . "\r\n";
+
+        $this->assertSame($expected, $this->view->render());
+    }
+
+    /**
+     * The Excel preset wins for the three keys it controls even when the
+     * user has explicitly set them to other values. `excel => true` is a
+     * single switch; for a different combination set the individual keys
+     * yourself instead of enabling the preset.
+     *
+     * @return void
+     */
+    public function testExcelPresetOverridesIndividualKeys()
+    {
+        $data = [['a', 'b']];
+        $this->view->set(['data' => $data])
+            ->setConfig([
+                'serialize' => 'data',
+                'excel' => true,
+                'bom' => false,
+                'eol' => "\n",
+            ]);
+
+        $output = $this->view->render();
+        $bom = chr(0xEF) . chr(0xBB) . chr(0xBF);
+        $this->assertStringStartsWith($bom, $output);
+        $this->assertStringEndsWith("\r\n", $output);
+    }
 }
