/*
 * Copyright (C) 2014 The Android Open Source Project
 * Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.  Oracle designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Oracle in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */

// -- This file was mechanically generated: Do not edit! -- //

package java.nio.charset;

import java.nio.Buffer;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.BufferOverflowException;
import java.nio.BufferUnderflowException;
import java.lang.ref.WeakReference;
import java.nio.charset.CoderMalfunctionError;                  // javadoc
import java.util.Arrays;


/**
 * An engine that can transform a sequence of bytes in a specific charset into a sequence of
 * sixteen-bit Unicode characters.
 *
 * <a name="steps"></a>
 *
 * <p> The input byte sequence is provided in a byte buffer or a series
 * of such buffers.
 * The output character sequence is written to a character buffer
 * or a series of such buffers.  A decoder should always be used by making
 * the following sequence of method invocations, hereinafter referred to as a
 * <i>decoding operation</i>:
 *
 * <ol>
 *
 *   <li><p> Reset the decoder via the {@link #reset reset} method, unless it
 *   has not been used before; </p></li>
 *
 *   <li><p> Invoke the {@link #decode decode} method zero or more times, as
 *   long as additional input may be available, passing <tt>false</tt> for the
 *   <tt>endOfInput</tt> argument and filling the input buffer and flushing the
 *   output buffer between invocations; </p></li>
 *
 *   <li><p> Invoke the {@link #decode decode} method one final time, passing
 *   <tt>true</tt> for the <tt>endOfInput</tt> argument; and then </p></li>
 *
 *   <li><p> Invoke the {@link #flush flush} method so that the decoder can
 *   flush any internal state to the output buffer. </p></li>
 *
 * </ol>
 *
 * Each invocation of the {@link #decode decode} method will decode as many
 * bytes as possible from the input buffer, writing the resulting characters
 * to the output buffer.  The {@link #decode decode} method returns when more
 * input is required, when there is not enough room in the output buffer, or
 * when a decoding error has occurred.  In each case a {@link CoderResult}
 * object is returned to describe the reason for termination.  An invoker can
 * examine this object and fill the input buffer, flush the output buffer, or
 * attempt to recover from a decoding error, as appropriate, and try again.
 *
 * <a name="ce"></a>
 *
 * <p> There are two general types of decoding errors.  If the input byte
 * sequence is not legal for this charset then the input is considered <i>malformed</i>.  If
 * the input byte sequence is legal but cannot be mapped to a valid
 * Unicode character then an <i>unmappable character</i> has been encountered.
 *
 * <a name="cae"></a>
 *
 * <p> How a decoding error is handled depends upon the action requested for
 * that type of error, which is described by an instance of the {@link
 * CodingErrorAction} class.  The possible error actions are to {@linkplain
 * CodingErrorAction#IGNORE ignore} the erroneous input, {@linkplain
 * CodingErrorAction#REPORT report} the error to the invoker via
 * the returned {@link CoderResult} object, or {@linkplain CodingErrorAction#REPLACE
 * replace} the erroneous input with the current value of the
 * replacement string.  The replacement
 * has the initial value <tt>"\uFFFD"</tt>;
 * its value may be changed via the {@link #replaceWith(java.lang.String)
 * replaceWith} method.
 *
 * <p> The default action for malformed-input and unmappable-character errors
 * is to {@linkplain CodingErrorAction#REPORT report} them.  The
 * malformed-input error action may be changed via the {@link
 * #onMalformedInput(CodingErrorAction) onMalformedInput} method; the
 * unmappable-character action may be changed via the {@link
 * #onUnmappableCharacter(CodingErrorAction) onUnmappableCharacter} method.
 *
 * <p> This class is designed to handle many of the details of the decoding
 * process, including the implementation of error actions.  A decoder for a
 * specific charset, which is a concrete subclass of this class, need only
 * implement the abstract {@link #decodeLoop decodeLoop} method, which
 * encapsulates the basic decoding loop.  A subclass that maintains internal
 * state should, additionally, override the {@link #implFlush implFlush} and
 * {@link #implReset implReset} methods.
 *
 * <p> Instances of this class are not safe for use by multiple concurrent
 * threads.
 * </p>
 *
 *
 * @author Mark Reinhold
 * @author JSR-51 Expert Group
 * @since 1.4
 *
 * @see ByteBuffer
 * @see CharBuffer
 * @see Charset
 * @see CharsetEncoder
 */

public abstract class CharsetDecoder {

    private final Charset charset;
    private final float averageCharsPerByte;
    private final float maxCharsPerByte;

    private String replacement;
    private CodingErrorAction malformedInputAction
        = CodingErrorAction.REPORT;
    private CodingErrorAction unmappableCharacterAction
        = CodingErrorAction.REPORT;

    // Internal states
    //
    private static final int ST_RESET   = 0;
    private static final int ST_CODING  = 1;
    private static final int ST_END     = 2;
    private static final int ST_FLUSHED = 3;

    private int state = ST_RESET;

    private static String stateNames[]
        = { "RESET", "CODING", "CODING_END", "FLUSHED" };


    /**
     * Initializes a new decoder.  The new decoder will have the given
     * chars-per-byte and replacement values.
     *
     * @param  cs
     *         The charset that created this decoder
     *
     * @param  averageCharsPerByte
     *         A positive float value indicating the expected number of
     *         characters that will be produced for each input byte
     *
     * @param  maxCharsPerByte
     *         A positive float value indicating the maximum number of
     *         characters that will be produced for each input byte
     *
     * @param  replacement
     *         The initial replacement; must not be <tt>null</tt>, must have
     *         non-zero length, must not be longer than maxCharsPerByte,
     *         and must be {@linkplain #isLegalReplacement legal}
     *
     * @throws  IllegalArgumentException
     *          If the preconditions on the parameters do not hold
     */
    private
    CharsetDecoder(Charset cs,
                   float averageCharsPerByte,
                   float maxCharsPerByte,
                   String replacement)
    {
        this.charset = cs;
        if (averageCharsPerByte <= 0.0f)
            throw new IllegalArgumentException("Non-positive "
                                               + "averageCharsPerByte");
        if (maxCharsPerByte <= 0.0f)
            throw new IllegalArgumentException("Non-positive "
                                               + "maxCharsPerByte");
        if (!Charset.atBugLevel("1.4")) {
            if (averageCharsPerByte > maxCharsPerByte)
                throw new IllegalArgumentException("averageCharsPerByte"
                                                   + " exceeds "
                                                   + "maxCharsPerByte");
        }
        this.replacement = replacement;
        this.averageCharsPerByte = averageCharsPerByte;
        this.maxCharsPerByte = maxCharsPerByte;
        // Android-removed
        // replaceWith(replacement);
    }

    /**
     * Initializes a new decoder.  The new decoder will have the given
     * chars-per-byte values and its replacement will be the
     * string <tt>"\uFFFD"</tt>.
     *
     * @param  cs
     *         The charset that created this decoder
     *
     * @param  averageCharsPerByte
     *         A positive float value indicating the expected number of
     *         characters that will be produced for each input byte
     *
     * @param  maxCharsPerByte
     *         A positive float value indicating the maximum number of
     *         characters that will be produced for each input byte
     *
     * @throws  IllegalArgumentException
     *          If the preconditions on the parameters do not hold
     */
    protected CharsetDecoder(Charset cs,
                             float averageCharsPerByte,
                             float maxCharsPerByte)
    {
        this(cs,
             averageCharsPerByte, maxCharsPerByte,
             "\uFFFD");
    }

    /**
     * Returns the charset that created this decoder.
     *
     * @return  This decoder's charset
     */
    public final Charset charset() {
        return charset;
    }

    /**
     * Returns this decoder's replacement value.
     *
     * @return  This decoder's current replacement,
     *          which is never <tt>null</tt> and is never empty
     */
    public final String replacement() {
        return replacement;
    }

    /**
     * Changes this decoder's replacement value.
     *
     * <p> This method invokes the {@link #implReplaceWith implReplaceWith}
     * method, passing the new replacement, after checking that the new
     * replacement is acceptable.
     * </p>
     *
     * @param  newReplacement
     *         The new replacement; must not be <tt>null</tt>,
     *         must have non-zero length, and must not be longer
     *         than the value returned by the
     *         {@link #maxCharsPerByte() maxCharsPerByte} method
     *
     * @return  This decoder
     *
     * @throws  IllegalArgumentException
     *          If the preconditions on the parameter do not hold
     */
    public final CharsetDecoder replaceWith(String newReplacement) {
        if (newReplacement == null)
            throw new IllegalArgumentException("Null replacement");
        int len = newReplacement.length();
        if (len == 0)
            throw new IllegalArgumentException("Empty replacement");
        if (len > maxCharsPerByte)
            throw new IllegalArgumentException("Replacement too long");

        this.replacement = newReplacement;

        implReplaceWith(this.replacement);
        return this;
    }

    /**
     * Reports a change to this decoder's replacement value.
     *
     * <p> The default implementation of this method does nothing.  This method
     * should be overridden by decoders that require notification of changes to
     * the replacement.  </p>
     *
     * @param  newReplacement    The replacement value
     */
    protected void implReplaceWith(String newReplacement) {
    }

    /**
     * Returns this decoder's current action for malformed-input errors.
     *
     * @return The current malformed-input action, which is never <tt>null</tt>
     */
    public CodingErrorAction malformedInputAction() {
        return malformedInputAction;
    }

    /**
     * Changes this decoder's action for malformed-input errors.
     *
     * <p> This method invokes the {@link #implOnMalformedInput
     * implOnMalformedInput} method, passing the new action.
     * </p>
     *
     * @param  newAction  The new action; must not be <tt>null</tt>
     *
     * @return  This decoder
     *
     * @throws IllegalArgumentException
     *         If the precondition on the parameter does not hold
     */
    public final CharsetDecoder onMalformedInput(CodingErrorAction newAction) {
        if (newAction == null)
            throw new IllegalArgumentException("Null action");
        malformedInputAction = newAction;
        implOnMalformedInput(newAction);
        return this;
    }

    /**
     * Reports a change to this decoder's malformed-input action.
     *
     * <p> The default implementation of this method does nothing.  This method
     * should be overridden by decoders that require notification of changes to
     * the malformed-input action.  </p>
     *
     * @param  newAction  The new action
     */
    protected void implOnMalformedInput(CodingErrorAction newAction) { }

    /**
     * Returns this decoder's current action for unmappable-character errors.
     *
     * @return The current unmappable-character action, which is never
     *         <tt>null</tt>
     */
    public CodingErrorAction unmappableCharacterAction() {
        return unmappableCharacterAction;
    }

    /**
     * Changes this decoder's action for unmappable-character errors.
     *
     * <p> This method invokes the {@link #implOnUnmappableCharacter
     * implOnUnmappableCharacter} method, passing the new action.
     * </p>
     *
     * @param  newAction  The new action; must not be <tt>null</tt>
     *
     * @return  This decoder
     *
     * @throws IllegalArgumentException
     *         If the precondition on the parameter does not hold
     */
    public final CharsetDecoder onUnmappableCharacter(CodingErrorAction
                                                      newAction)
    {
        if (newAction == null)
            throw new IllegalArgumentException("Null action");
        unmappableCharacterAction = newAction;
        implOnUnmappableCharacter(newAction);
        return this;
    }

    /**
     * Reports a change to this decoder's unmappable-character action.
     *
     * <p> The default implementation of this method does nothing.  This method
     * should be overridden by decoders that require notification of changes to
     * the unmappable-character action.  </p>
     *
     * @param  newAction  The new action
     */
    protected void implOnUnmappableCharacter(CodingErrorAction newAction) { }

    /**
     * Returns the average number of characters that will be produced for each
     * byte of input.  This heuristic value may be used to estimate the size
     * of the output buffer required for a given input sequence.
     *
     * @return  The average number of characters produced
     *          per byte of input
     */
    public final float averageCharsPerByte() {
        return averageCharsPerByte;
    }

    /**
     * Returns the maximum number of characters that will be produced for each
     * byte of input.  This value may be used to compute the worst-case size
     * of the output buffer required for a given input sequence.
     *
     * @return  The maximum number of characters that will be produced per
     *          byte of input
     */
    public final float maxCharsPerByte() {
        return maxCharsPerByte;
    }

    /**
     * Decodes as many bytes as possible from the given input buffer,
     * writing the results to the given output buffer.
     *
     * <p> The buffers are read from, and written to, starting at their current
     * positions.  At most {@link Buffer#remaining in.remaining()} bytes
     * will be read and at most {@link Buffer#remaining out.remaining()}
     * characters will be written.  The buffers' positions will be advanced to
     * reflect the bytes read and the characters written, but their marks and
     * limits will not be modified.
     *
     * <p> In addition to reading bytes from the input buffer and writing
     * characters to the output buffer, this method returns a {@link CoderResult}
     * object to describe its reason for termination:
     *
     * <ul>
     *
     *   <li><p> {@link CoderResult#UNDERFLOW} indicates that as much of the
     *   input buffer as possible has been decoded.  If there is no further
     *   input then the invoker can proceed to the next step of the
     *   <a href="#steps">decoding operation</a>.  Otherwise this method
     *   should be invoked again with further input.  </p></li>
     *
     *   <li><p> {@link CoderResult#OVERFLOW} indicates that there is
     *   insufficient space in the output buffer to decode any more bytes.
     *   This method should be invoked again with an output buffer that has
     *   more {@linkplain Buffer#remaining remaining} characters. This is
     *   typically done by draining any decoded characters from the output
     *   buffer.  </p></li>
     *
     *   <li><p> A {@linkplain CoderResult#malformedForLength
     *   malformed-input} result indicates that a malformed-input
     *   error has been detected.
     *   The malformed bytes begin at the input
     *   buffer's (possibly incremented) position; the number of malformed
     *   bytes may be determined by invoking the result object's {@link
     *   CoderResult#length() length} method.  This case applies only if the
     *   {@linkplain #onMalformedInput malformed action} of this decoder
     *   is {@link CodingErrorAction#REPORT}; otherwise the malformed input
     *   will be ignored or replaced, as requested.  </p></li>
     *
     *   <li><p> An {@linkplain CoderResult#unmappableForLength
     *   unmappable-character} result indicates that an
     *   unmappable-character error has been detected.  The bytes that
     *   decode the unmappable character begin at the input buffer's (possibly
     *   incremented) position; the number of such bytes may be determined
     *   by invoking the result object's {@link CoderResult#length() length}
     *   method.  This case applies only if the {@linkplain #onUnmappableCharacter
     *   unmappable action} of this decoder is {@link
     *   CodingErrorAction#REPORT}; otherwise the unmappable character will be
     *   ignored or replaced, as requested.  </p></li>
     *
     * </ul>
     *
     * In any case, if this method is to be reinvoked in the same decoding
     * operation then care should be taken to preserve any bytes remaining
     * in the input buffer so that they are available to the next invocation.
     *
     * <p> The <tt>endOfInput</tt> parameter advises this method as to whether
     * the invoker can provide further input beyond that contained in the given
     * input buffer.  If there is a possibility of providing additional input
     * then the invoker should pass <tt>false</tt> for this parameter; if there
     * is no possibility of providing further input then the invoker should
     * pass <tt>true</tt>.  It is not erroneous, and in fact it is quite
     * common, to pass <tt>false</tt> in one invocation and later discover that
     * no further input was actually available.
     * It is critical, however, that
     * the final invocation of this method in a sequence of invocations always
     * pass <tt>true</tt> so that any remaining undecoded input will be treated
     * as being malformed.
     *
     * <p> This method works by invoking the {@link #decodeLoop decodeLoop}
     * method, interpreting its results, handling error conditions, and
     * reinvoking it as necessary.  </p>
     *
     *
     * @param  in
     *         The input byte buffer
     *
     * @param  out
     *         The output character buffer
     *
     * @param  endOfInput
     *         <tt>true</tt> if, and only if, the invoker can provide no
     *         additional input bytes beyond those in the given buffer
     *
     * @return  A coder-result object describing the reason for termination
     *
     * @throws  IllegalStateException
     *          If a decoding operation is already in progress and the previous
     *          step was an invocation neither of the {@link #reset reset}
     *          method, nor of this method with a value of <tt>false</tt> for
     *          the <tt>endOfInput</tt> parameter, nor of this method with a
     *          value of <tt>true</tt> for the <tt>endOfInput</tt> parameter
     *          but a return value indicating an incomplete decoding operation
     *
     * @throws  CoderMalfunctionError
     *          If an invocation of the decodeLoop method threw
     *          an unexpected exception
     */
    public final CoderResult decode(ByteBuffer in, CharBuffer out,
                                    boolean endOfInput)
    {
        int newState = endOfInput ?
            ST_END : ST_CODING;
        if ((state != ST_RESET) && (state != ST_CODING)
            && !(endOfInput && (state == ST_END)))
            throwIllegalStateException(state, newState);
        state = newState;

        for (;;) {

            CoderResult cr;
            try {
                cr = decodeLoop(in, out);
            } catch (BufferUnderflowException x) {
                throw new CoderMalfunctionError(x);
            } catch (BufferOverflowException x) {
                throw new CoderMalfunctionError(x);
            }

            if (cr.isOverflow())
                return cr;

            if (cr.isUnderflow()) {
                if (endOfInput && in.hasRemaining()) {
                    cr = CoderResult.malformedForLength(in.remaining());
                    // Fall through to malformed-input case
                } else {
                    return cr;
                }
            }

            CodingErrorAction action = null;
            if (cr.isMalformed())
                action = malformedInputAction;
            else if (cr.isUnmappable())
                action = unmappableCharacterAction;
            else
                assert false : cr.toString();

            if (action == CodingErrorAction.REPORT)
                return cr;

            if (action == CodingErrorAction.REPLACE) {
                if (out.remaining() < replacement.length())
                    return CoderResult.OVERFLOW;
                out.put(replacement);
            }

            if ((action == CodingErrorAction.IGNORE)
                || (action == CodingErrorAction.REPLACE)) {
                // Skip erroneous input either way
                in.position(in.position() + cr.length());
                continue;
            }

            assert false;
        }

    }

    /**
     * Flushes this decoder.
     *
     * <p> Some decoders maintain internal state and may need to write some
     * final characters to the output buffer once the overall input sequence has
     * been read.
     *
     * <p> Any additional output is written to the output buffer beginning at
     * its current position.  At most {@link Buffer#remaining out.remaining()}
     * characters will be written.  The buffer's position will be advanced
     * appropriately, but its mark and limit will not be modified.
     *
     * <p> If this method completes successfully then it returns {@link
     * CoderResult#UNDERFLOW}.  If there is insufficient room in the output
     * buffer then it returns {@link CoderResult#OVERFLOW}.  If this happens
     * then this method must be invoked again, with an output buffer that has
     * more room, in order to complete the current <a href="#steps">decoding
     * operation</a>.
     *
     * <p> If this decoder has already been flushed then invoking this method
     * has no effect.
     *
     * <p> This method invokes the {@link #implFlush implFlush} method to
     * perform the actual flushing operation.  </p>
     *
     * @param  out
     *         The output character buffer
     *
     * @return  A coder-result object, either {@link CoderResult#UNDERFLOW} or
     *          {@link CoderResult#OVERFLOW}
     *
     * @throws  IllegalStateException
     *          If the previous step of the current decoding operation was an
     *          invocation neither of the {@link #flush flush} method nor of
     *          the three-argument {@link
     *          #decode(ByteBuffer,CharBuffer,boolean) decode} method
     *          with a value of <tt>true</tt> for the <tt>endOfInput</tt>
     *          parameter
     */
    public final CoderResult flush(CharBuffer out) {
        if (state == ST_END) {
            CoderResult cr = implFlush(out);
            if (cr.isUnderflow())
                state = ST_FLUSHED;
            return cr;
        }

        if (state != ST_FLUSHED)
            throwIllegalStateException(state, ST_FLUSHED);

        return CoderResult.UNDERFLOW; // Already flushed
    }

    /**
     * Flushes this decoder.
     *
     * <p> The default implementation of this method does nothing, and always
     * returns {@link CoderResult#UNDERFLOW}.  This method should be overridden
     * by decoders that may need to write final characters to the output buffer
     * once the entire input sequence has been read.
     * </p>
     *
     * @param  out
     *         The output character buffer
     *
     * @return  A coder-result object, either {@link CoderResult#UNDERFLOW} or
     *          {@link CoderResult#OVERFLOW}
     */
    protected CoderResult implFlush(CharBuffer out) {
        return CoderResult.UNDERFLOW;
    }

    /**
     * Resets this decoder, clearing any internal state.
     *
     * <p> This method resets charset-independent state and also invokes the
     * {@link #implReset() implReset} method in order to perform any
     * charset-specific reset actions.  </p>
     *
     * @return  This decoder
     *
     */
    public final CharsetDecoder reset() {
        implReset();
        state = ST_RESET;
        return this;
    }

    /**
     * Resets this decoder, clearing any charset-specific internal state.
     *
     * <p> The default implementation of this method does nothing.  This method
     * should be overridden by decoders that maintain internal state.  </p>
     */
    protected void implReset() { }

    /**
     * Decodes one or more bytes into one or more characters.
     *
     * <p> This method encapsulates the basic decoding loop, decoding as many
     * bytes as possible until it either runs out of input, runs out of room
     * in the output buffer, or encounters a decoding error.  This method is
     * invoked by the {@link #decode decode} method, which handles result
     * interpretation and error recovery.
     *
     * <p> The buffers are read from, and written to, starting at their current
     * positions.  At most {@link Buffer#remaining in.remaining()} bytes
     * will be read, and at most {@link Buffer#remaining out.remaining()}
     * characters will be written.  The buffers' positions will be advanced to
     * reflect the bytes read and the characters written, but their marks and
     * limits will not be modified.
     *
     * <p> This method returns a {@link CoderResult} object to describe its
     * reason for termination, in the same manner as the {@link #decode decode}
     * method.  Most implementations of this method will handle decoding errors
     * by returning an appropriate result object for interpretation by the
     * {@link #decode decode} method.  An optimized implementation may instead
     * examine the relevant error action and implement that action itself.
     *
     * <p> An implementation of this method may perform arbitrary lookahead by
     * returning {@link CoderResult#UNDERFLOW} until it receives sufficient
     * input.  </p>
     *
     * @param  in
     *         The input byte buffer
     *
     * @param  out
     *         The output character buffer
     *
     * @return  A coder-result object describing the reason for termination
     */
    protected abstract CoderResult decodeLoop(ByteBuffer in,
                                              CharBuffer out);

    /**
     * Convenience method that decodes the remaining content of a single input
     * byte buffer into a newly-allocated character buffer.
     *
     * <p> This method implements an entire <a href="#steps">decoding
     * operation</a>; that is, it resets this decoder, then it decodes the
     * bytes in the given byte buffer, and finally it flushes this
     * decoder.  This method should therefore not be invoked if a decoding
     * operation is already in progress.  </p>
     *
     * @param  in
     *         The input byte buffer
     *
     * @return A newly-allocated character buffer containing the result of the
     *         decoding operation.  The buffer's position will be zero and its
     *         limit will follow the last character written.
     *
     * @throws  IllegalStateException
     *          If a decoding operation is already in progress
     *
     * @throws  MalformedInputException
     *          If the byte sequence starting at the input buffer's current
     *          position is not legal for this charset and the current malformed-input action
     *          is {@link CodingErrorAction#REPORT}
     *
     * @throws  UnmappableCharacterException
     *          If the byte sequence starting at the input buffer's current
     *          position cannot be mapped to an equivalent character sequence and
     *          the current unmappable-character action is {@link
     *          CodingErrorAction#REPORT}
     */
    public final CharBuffer decode(ByteBuffer in)
        throws CharacterCodingException
    {
        int n = (int)(in.remaining() * averageCharsPerByte());
        CharBuffer out = CharBuffer.allocate(n);

        if ((n == 0) && (in.remaining() == 0))
            return out;
        reset();
        for (;;) {
            CoderResult cr = in.hasRemaining() ?
                decode(in, out, true) : CoderResult.UNDERFLOW;
            if (cr.isUnderflow())
                cr = flush(out);

            if (cr.isUnderflow())
                break;
            if (cr.isOverflow()) {
                n = 2*n + 1;    // Ensure progress; n might be 0!
                CharBuffer o = CharBuffer.allocate(n);
                out.flip();
                o.put(out);
                out = o;
                continue;
            }
            cr.throwException();
        }
        out.flip();
        return out;
    }

    /**
     * Tells whether or not this decoder implements an auto-detecting charset.
     *
     * <p> The default implementation of this method always returns
     * <tt>false</tt>; it should be overridden by auto-detecting decoders to
     * return <tt>true</tt>.  </p>
     *
     * @return  <tt>true</tt> if, and only if, this decoder implements an
     *          auto-detecting charset
     */
    public boolean isAutoDetecting() {
        return false;
    }

    /**
     * Tells whether or not this decoder has yet detected a
     * charset&nbsp;&nbsp;<i>(optional operation)</i>.
     *
     * <p> If this decoder implements an auto-detecting charset then at a
     * single point during a decoding operation this method may start returning
     * <tt>true</tt> to indicate that a specific charset has been detected in
     * the input byte sequence.  Once this occurs, the {@link #detectedCharset
     * detectedCharset} method may be invoked to retrieve the detected charset.
     *
     * <p> That this method returns <tt>false</tt> does not imply that no bytes
     * have yet been decoded.  Some auto-detecting decoders are capable of
     * decoding some, or even all, of an input byte sequence without fixing on
     * a particular charset.
     *
     * <p> The default implementation of this method always throws an {@link
     * UnsupportedOperationException}; it should be overridden by
     * auto-detecting decoders to return <tt>true</tt> once the input charset
     * has been determined.  </p>
     *
     * @return  <tt>true</tt> if, and only if, this decoder has detected a
     *          specific charset
     *
     * @throws  UnsupportedOperationException
     *          If this decoder does not implement an auto-detecting charset
     */
    public boolean isCharsetDetected() {
        throw new UnsupportedOperationException();
    }

    /**
     * Retrieves the charset that was detected by this
     * decoder&nbsp;&nbsp;<i>(optional operation)</i>.
     *
     * <p> If this decoder implements an auto-detecting charset then this
     * method returns the actual charset once it has been detected.  After that
     * point, this method returns the same value for the duration of the
     * current decoding operation.  If not enough input bytes have yet been
     * read to determine the actual charset then this method throws an {@link
     * IllegalStateException}.
     *
     * <p> The default implementation of this method always throws an {@link
     * UnsupportedOperationException}; it should be overridden by
     * auto-detecting decoders to return the appropriate value.
     * </p>
     *
     * @return  The charset detected by this auto-detecting decoder,
     *          or <tt>null</tt> if the charset has not yet been determined
     *
     * @throws  IllegalStateException
     *          If insufficient bytes have been read to determine a charset
     *
     * @throws  UnsupportedOperationException
     *          If this decoder does not implement an auto-detecting charset
     */
    public Charset detectedCharset() {
        throw new UnsupportedOperationException();
    }

    private void throwIllegalStateException(int from, int to) {
        throw new IllegalStateException("Current state = " + stateNames[from]
                                        + ", new state = " + stateNames[to]);
    }

}
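
// Usage sketch (not part of the generated class above). The class comment
// describes a four-step decoding operation: reset, decode with
// endOfInput == false while input keeps arriving, one final decode with
// endOfInput == true, then flush. The standalone example below is one way
// a caller might follow those steps when input arrives in chunks. The names
// DecodingOperationSketch, decodeChunks, decodeStep and drain, and the buffer
// sizes, are hypothetical and chosen for illustration only. It also assumes
// that an incomplete trailing byte sequence is always much shorter than the
// 16-byte input buffer, which should hold for common charsets such as UTF-8.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; not part of java.nio.charset.
final class DecodingOperationSketch {

    // Decodes input that arrives in arbitrary chunks, following the four steps.
    static String decodeChunks(Charset cs, byte[][] chunks) {
        CharsetDecoder decoder = cs.newDecoder();  // error actions default to REPORT
        ByteBuffer in = ByteBuffer.allocate(16);   // kept in "fill" mode between steps
        CharBuffer out = CharBuffer.allocate(8);   // small on purpose, to exercise OVERFLOW
        StringBuilder sb = new StringBuilder();
        try {
            decoder.reset();                                  // step 1
            for (byte[] chunk : chunks) {
                int off = 0;
                while (off < chunk.length) {
                    // Copy as much of the chunk as fits behind any leftover bytes.
                    int n = Math.min(chunk.length - off, in.remaining());
                    in.put(chunk, off, n);
                    off += n;
                    in.flip();
                    decodeStep(decoder, in, out, false, sb);  // step 2
                    in.compact();  // preserve undecoded bytes for the next call
                }
            }
            in.flip();
            decodeStep(decoder, in, out, true, sb);           // step 3
            CoderResult cr;
            do {                                              // step 4
                cr = decoder.flush(out);
                drain(out, sb);
            } while (cr.isOverflow());
        } catch (CharacterCodingException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Calls decode until it reports UNDERFLOW, draining the output on OVERFLOW.
    private static void decodeStep(CharsetDecoder decoder, ByteBuffer in,
                                   CharBuffer out, boolean endOfInput,
                                   StringBuilder sb)
            throws CharacterCodingException {
        for (;;) {
            CoderResult cr = decoder.decode(in, out, endOfInput);
            drain(out, sb);
            if (cr.isUnderflow())
                return;
            if (cr.isError())
                cr.throwException();  // malformed input or unmappable character
            // Otherwise OVERFLOW: the output is now empty, so try again.
        }
    }

    // Moves the decoded characters into the builder and empties the buffer.
    private static void drain(CharBuffer out, StringBuilder sb) {
        out.flip();
        sb.append(out);
        out.clear();
    }

    public static void main(String[] args) {
        // The two-byte UTF-8 sequence C3 A9 is split across the chunk boundary.
        byte[][] chunks = { {0x68, (byte) 0xC3}, {(byte) 0xA9, 0x6C, 0x6C, 0x6F} };
        System.out.println(decodeChunks(StandardCharsets.UTF_8, chunks));
    }
}
```

// In the sketch, compact() is what keeps the bytes of a sequence split across
// chunks available to the next decode call, as the decode documentation asks.
// For input that is already held in a single buffer, the one-argument
// decode(ByteBuffer) convenience method performs the whole operation itself.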