src/main/java/org/apache/commons/csv/ExtendedBufferedReader.java
42 additions & 34 deletions
@@ -37,26 +37,30 @@
  /**
  * A special buffered reader which supports sophisticated read access.
  * <p>
- * In particular the reader supports a look-ahead option, which allows you to see the next char returned by
- * {@link #read()}. This reader also tracks how many characters have been read with {@link #getPosition()}.
+ * In particular the reader supports a look-ahead option, which allows you to see the next char returned by {@link #read()}. This reader also tracks how many
+ * characters have been read with {@link #getPosition()}.
  /** The position, which is the number of characters read so far */
  private long position;
+
  private long positionMark;

  /** The number of bytes read so far. */
  private long bytesRead;
+
  private long bytesReadMark;

  /** Encoder for calculating the number of bytes for each character read. */
@@ -70,12 +74,11 @@ final class ExtendedBufferedReader extends UnsynchronizedBufferedReader {
  }

  /**
- * Constructs a new instance with the specified reader, character set,
- * and byte tracking option. Initializes an encoder if byte tracking is enabled
- * and a character set is provided.
+ * Constructs a new instance with the specified reader, character set, and byte tracking option. Initializes an encoder if byte tracking is enabled and a
+ * character set is provided.
  *
- * @param reader the reader supports a look-ahead option.
- * @param charset the character set for encoding, or {@code null} if not applicable.
+ * @param reader the reader supports a look-ahead option.
+ * @param charset the character set for encoding, or {@code null} if not applicable.
  * @param trackBytes {@code true} to enable byte tracking; {@code false} to disable it.
- * Gets the byte length of the given character based on the original Unicode
- * specification, which defined characters as fixed-width 16-bit entities.
+ * Gets the byte length of the given character based on the original Unicode specification, which defined characters as fixed-width 16-bit entities.
  * <p>
  * The Unicode characters are divided into two main ranges:
  * <ul>
- * <li><strong>U+0000 to U+FFFF (Basic Multilingual Plane, BMP):</strong>
- * <ul>
- * <li>Represented using a single 16-bit {@code char}.</li>
- * <li>Includes UTF-8 encodings of 1-byte, 2-byte, and some 3-byte characters.</li>
- * </ul>
- * </li>
- * <li><strong>U+10000 to U+10FFFF (Supplementary Characters):</strong>
- * <ul>
- * <li>Represented as a pair of {@code char}s:</li>
- * <li>The first {@code char} is from the high-surrogates range (\uD800-\uDBFF).</li>
- * <li>The second {@code char} is from the low-surrogates range (\uDC00-\uDFFF).</li>
- * <li>Includes UTF-8 encodings of some 3-byte characters and all 4-byte characters.</li>
- * </ul>
- * </li>
+ * <li><strong>U+0000 to U+FFFF (Basic Multilingual Plane, BMP):</strong>
+ * <ul>
+ * <li>Represented using a single 16-bit {@code char}.</li>
+ * <li>Includes UTF-8 encodings of 1-byte, 2-byte, and some 3-byte characters.</li>
+ * </ul>
+ * </li>
+ * <li><strong>U+10000 to U+10FFFF (Supplementary Characters):</strong>
+ * <ul>
+ * <li>Represented as a pair of {@code char}s:</li>
+ * <li>The first {@code char} is from the high-surrogates range (\uD800-\uDBFF).</li>
+ * <li>The second {@code char} is from the low-surrogates range (\uDC00-\uDFFF).</li>
+ * <li>Includes UTF-8 encodings of some 3-byte characters and all 4-byte characters.</li>
+ * </ul>
+ * </li>
  * </ul>
  *
  * @param current the current character to process.
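The Javadoc in this hunk describes how a character's encoded byte length depends on whether it sits in the BMP or needs a surrogate pair. The following is a minimal, standalone sketch of that idea and not the body of `getEncodedCharLength`, which this diff does not show. The class name `EncodedLengthSketch` and the method `encodedLength` are hypothetical, and the sketch takes a whole code point rather than a single `char`.

```java
import java.io.UncheckedIOException;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Commons CSV.
final class EncodedLengthSketch {

    /** Returns how many bytes the given code point occupies in the given charset. */
    static int encodedLength(final int codePoint, final Charset charset) {
        // One char for BMP code points, a high/low surrogate pair otherwise.
        final char[] chars = Character.toChars(codePoint);
        try {
            // encode() returns a buffer positioned at 0 whose limit is the byte count.
            return charset.newEncoder().encode(CharBuffer.wrap(chars)).limit();
        } catch (final CharacterCodingException e) {
            // For example, a lone surrogate cannot be encoded on its own.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(final String[] args) {
        System.out.println(encodedLength('A', StandardCharsets.UTF_8));     // 1
        System.out.println(encodedLength(0x00E9, StandardCharsets.UTF_8));  // 2
        System.out.println(encodedLength(0x20AC, StandardCharsets.UTF_8));  // 3
        System.out.println(encodedLength(0x1F600, StandardCharsets.UTF_8)); // 4
    }
}
```

Working on code points avoids having to remember the previous `char` when a surrogate pair arrives in two separate reads; a reader that processes one `char` at a time would need that extra state.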
@@ -148,10 +157,9 @@ private int getEncodedCharLength(final int current) throws CharacterCodingExcept
  }

  /**
- * Returns the last character that was read as an integer (0 to 65535). This will be the last character returned by
- * any of the read methods. This will not include a character read using the {@link #peek()} method. If no
- * character has been read then this will return {@link Constants#UNDEFINED}. If the end of the stream was reached
- * on the last read then this will return {@link IOUtils#EOF}.
+ * Returns the last character that was read as an integer (0 to 65535). This will be the last character returned by any of the read methods. This will not
+ * include a character read using the {@link #peek()} method. If no character has been read then this will return {@link Constants#UNDEFINED}. If the end of
+ * the stream was reached on the last read then this will return {@link IOUtils#EOF}.
  *
  * @return the last character that was read
  */
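The contract described in this Javadoc (a peek does not count as a read, and two sentinels mark "nothing read yet" and "end of stream") can be illustrated with a small sketch built on `java.io.BufferedReader`. This is an assumption-laden illustration, not the Commons CSV implementation: the class `PeekingReaderSketch`, its `demo` helper, and the sentinel value chosen for `UNDEFINED` are all hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical reader, not part of Commons CSV.
final class PeekingReaderSketch extends BufferedReader {

    static final int EOF = -1;
    static final int UNDEFINED = -2; // assumed sentinel for "no character read yet"

    private int lastChar = UNDEFINED;

    PeekingReaderSketch(final Reader reader) {
        super(reader);
    }

    @Override
    public int read() throws IOException {
        lastChar = super.read(); // a real read updates the last character
        return lastChar;
    }

    /** Looks at the next character without consuming it or touching lastChar. */
    int peek() throws IOException {
        mark(1);
        final int next = super.read();
        reset();
        return next;
    }

    int getLastChar() {
        return lastChar;
    }

    /** Returns {last before, peeked, last after peek, read, last after read}. */
    static int[] demo(final String text) {
        try (PeekingReaderSketch reader = new PeekingReaderSketch(new StringReader(text))) {
            final int before = reader.getLastChar();
            final int peeked = reader.peek();
            final int afterPeek = reader.getLastChar();
            final int read = reader.read();
            final int afterRead = reader.getLastChar();
            return new int[] {before, peeked, afterPeek, read, afterRead};
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(final String[] args) {
        System.out.println(java.util.Arrays.toString(demo("ab"))); // [-2, 97, -2, 97, 97]
    }
}
```

Using `mark(1)` and `reset()` keeps the sketch short; a reader that peeks often might prefer to cache the peeked character instead.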
@@ -193,8 +201,7 @@ public void mark(final int readAheadLimit) throws IOException {
  @Override
  public int read() throws IOException {
      final int current = super.read();
-     if (current == CR || current == LF && lastChar != CR ||
- * Gets the next line, dropping the line terminator(s). This method should only be called when processing a
- * comment, otherwise, information can be lost.
+ * Gets the next line, dropping the line terminator(s). This method should only be called when processing a comment, otherwise, information can be lost.
  * <p>
  * Increments {@link #lineNumber} and updates {@link #position}.
  * </p>
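Both `read()` and the line-reading method above maintain a line counter. The sketch below illustrates only the part of the `read()` condition that is visible in this diff (`current == CR || current == LF && lastChar != CR`), which treats CR, LF, and CRLF each as a single terminator. The rest of that condition is cut off in the diff and is not reproduced here. The class `LineCountSketch` and the method `countLineTerminators` are hypothetical.

```java
// Hypothetical helper, not part of Commons CSV.
final class LineCountSketch {

    static final int CR = '\r';
    static final int LF = '\n';

    /** Counts line terminators, treating CR, LF, and CRLF each as one terminator. */
    static long countLineTerminators(final CharSequence text) {
        long lineNumber = 0;
        int lastChar = -2; // assumed sentinel for "no character read yet"
        for (int i = 0; i < text.length(); i++) {
            final int current = text.charAt(i);
            // && binds tighter than ||: an LF only counts when it does not follow a CR.
            if (current == CR || current == LF && lastChar != CR) {
                lineNumber++;
            }
            lastChar = current;
        }
        return lineNumber;
    }

    public static void main(final String[] args) {
        System.out.println(countLineTerminators("a\r\nb\nc\rd")); // 3
    }
}
```

Note that this counts terminators only, so a final line without a terminator is not counted.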
@@ -272,5 +281,4 @@ public void reset() throws IOException {