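| Name | Date | Size | #Lines | LOC |
| :--- | :--- | :--- | :--- | :--- |
| Asm/ | 07-Sep-2024 | - | 5,319 | 4,128 |
| C/ | 07-Sep-2024 | - | 38,609 | 27,234 |
| CPP/ | 07-Sep-2024 | - | 116,429 | 84,072 |
| CS/7zip/ | 07-Sep-2024 | - | 4,586 | 3,863 |
| DOC/ | 07-Sep-2024 | - | 3,491 | 2,548 |
| Java/SevenZip/ | 07-Sep-2024 | - | 3,541 | 3,077 |
| BUILD.gn | 07-Sep-2024 | 3.7 KiB | 167 | 144 |
| LICENSE | 07-Sep-2024 | 464 | 11 | 8 |
| NOTICE | 07-Sep-2024 | 464 | 11 | 8 |
| OAT.xml | 07-Sep-2024 | 32.3 KiB | 520 | 497 |
| README.OpenSource | 07-Sep-2024 | 309 | 12 | 11 |
| README.md | 07-Sep-2024 | 18.6 KiB | 548 | 412 |
| README_zh.md | 07-Sep-2024 | 17.3 KiB | 501 | 371 |
| bundle.json | 07-Sep-2024 | 1.2 KiB | 47 | 47 |
| lzma.gni | 07-Sep-2024 | 1.3 KiB | 61 | 59 |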

README.OpenSource

[
  {
    "Name": "lzma",
    "License": "Public domain",
    "License File": "LICENSE",
    "Version Number": "23.01",
    "Owner": "zangleizhen@huawei.com",
    "Upstream URL": "https://7-zip.org/a/lzma2301.7z",
    "Description": "LZMA is default and general compression method of 7z and xz format."
  }
]

README.md

# third_party_lzma

## Description

---
LZMA SDK provides the documentation, samples, header files,
libraries, and tools you need to develop applications that
use 7z / LZMA / LZMA2 / XZ compression.

LZMA is an improved version of the well-known LZ77 compression algorithm.
It was designed to maximize the compression ratio while keeping
decompression fast and decompression memory requirements low.

LZMA2 is an LZMA-based compression method. LZMA2 provides better
multithreading support for compression than LZMA, along with some other improvements.

7z is a file format for data compression and file archiving.
7z is the main file format of the 7-Zip compression program (www.7-zip.org).
The 7z format supports different compression methods: LZMA, LZMA2 and others.
7z also supports AES-256 based encryption.

XZ is a file format for data compression that uses LZMA2 compression.
The XZ format provides additional features: SHA/CRC checks, filters for
improved compression ratio, and splitting into blocks and streams.

---

## Software Architecture

---
Source code:

| format/algorithm  | C | C++ | C# | Java |
| :------ | :---------| :----- | :----- | :----- |
| LZMA compression and decompression                |  ✓         | ✓      |  ✓    |  ✓    |
| LZMA2 compression and decompression               |  ✓         | ✓      |       |        |
| XZ compression and decompression                  |  ✓         | ✓      |       |        |
| 7z decompression                                  | ✓          | ✓      |       |        |
| 7z compression                                    |            | ✓      |       |        |
| small SFXs for installers (7z decompression)      |  ✓         |         |       |        |
| SFXs and SFXs for installers (7z decompression)   |            | ✓      |       |        |

---
Source code structure

```bash
/third_party/lzma
├── Asm                             # asm files (optimized code for CRC calculation and Intel-AES encryption)
│   ├── arm
│   ├── arm64
│   └── x86
├── C                               # C files (compression / decompression and other)
│   └── Util
│       ├── 7z                      # 7z decoder program (decoding 7z files)
│       ├── Lzma                    # LZMA program (file->file LZMA encoder/decoder)
│       ├── LzmaLib                 # LZMA library (.DLL for Windows)
│       └── SfxSetup                # small SFX module for installers
├── CPP
│   ├── Common                      # common files for C++ projects
│   ├── Windows                     # common files for Windows related code
│   └── 7zip                        # files related to 7-Zip
│       ├── Archive                 # files related to archiving
│       │   ├── Common              # common files for archive handling
│       │   └── 7z                  # 7z C++ Encoder/Decoder
│       ├── Bundles                 # Modules that are bundles of other modules (files)
│       │   ├── Alone7z             # 7zr.exe: Standalone 7-Zip console program (reduced version)
│       │   ├── Format7zExtractR    # 7zxr.dll: Reduced version of 7z DLL: extracting from 7z/LZMA/BCJ/BCJ2.
│       │   ├── Format7zR           # 7zr.dll:  Reduced version of 7z DLL: extracting/compressing to 7z/LZMA/BCJ/BCJ2
│       │   ├── LzmaCon             # lzma.exe: LZMA compression/decompression
│       │   ├── LzmaSpec            # example code for LZMA Specification
│       │   ├── SFXCon              # 7zCon.sfx: Console 7z SFX module
│       │   ├── SFXSetup            # 7zS.sfx: 7z SFX module for installers
│       │   └── SFXWin              # 7z.sfx: GUI 7z SFX module
│       ├── Common                  # common files for 7-Zip
│       ├── Compress                # files for compression/decompression
│       ├── Crypto                  # files for encryption / decryption
│       └── UI                      # User Interface files
│           ├── Client7z            # Test application for 7za.dll, 7zr.dll, 7zxr.dll
│           ├── Common              # Common UI files
│           ├── Console             # Code for console program (7z.exe)
│           ├── Explorer            # Some code from 7-Zip Shell extension
│           ├── FileManager         # Some GUI code from 7-Zip File Manager
│           └── GUI                 # Some GUI code from 7-Zip
├── CS
│   └── 7zip
│       ├── Common                  # some common files for 7-Zip
│       └── Compress                # files related to compression/decompression
│           ├── LZ                  # files related to LZ (Lempel-Ziv) compression algorithm
│           ├── LZMA                # LZMA compression/decompression
│           ├── LzmaAlone           # file->file LZMA compression/decompression
│           └── RangeCoder          # Range Coder (special code of compression/decompression)
├── DOC
│   ├── 7zC.txt                     # 7z ANSI-C Decoder description
│   ├── 7zFormat.txt                # 7z Format description
│   ├── installer.txt               # information about 7-Zip for installers
│   ├── lzma-history.txt            # history of LZMA SDK
│   ├── lzma-sdk.txt                # LZMA SDK description
│   ├── lzma-specification.txt      # Specification of LZMA
│   ├── lzma.txt                    # LZMA compression description
│   └── Methods.txt                 # Compression method IDs for .7z
└── Java
    └── SevenZip
        └── Compression             # files related to compression/decompression
            ├── LZ                  # files related to LZ (Lempel-Ziv) compression algorithm
            ├── LZMA                # LZMA compression/decompression
            └── RangeCoder          # Range Coder (special code of compression/decompression)
```

---

## NOTICES / LICENSE

LZMA SDK is written and placed in the public domain by Igor Pavlov.

Some code in LZMA SDK is based on public domain code from other developers:

  1) PPMd var.H (2001): Dmitry Shkarin
  2) SHA-256: Wei Dai (Crypto++ library)

Anyone is free to copy, modify, publish, use, compile, sell, or distribute the
original LZMA SDK code, either in source code form or as a compiled binary, for
any purpose, commercial or non-commercial, and by any means.

LZMA SDK code is compatible with open source licenses; for example, you can
include it in GNU GPL or GNU LGPL code.

## Build

### ***UNIX/Linux version***

There are several options for compiling 7-Zip with different compilers: gcc and clang.
The 7-Zip code also contains two versions of some critical parts: one in C and one in assembler.
If you compile the version with assembler code, you get a faster 7-Zip binary.

7-Zip's assembler code uses the following syntax for different platforms:

#### *arm64: GNU assembler for ARM64 with preprocessor*

The syntax of the arm64 assembler code in 7-Zip is supported by GCC and Clang for ARM64.

#### *x86 and x86_64(AMD64)*

There are 2 programs that support MASM syntax on Linux:
Asmc Macro Assembler and JWasm. However, JWasm currently doesn't support some CPU instructions used in 7-Zip.
So you must install Asmc Macro Assembler on Linux if you want to compile the fastest version of 7-Zip for x86 and x86-64: [https://github.com/nidud/asmc](https://github.com/nidud/asmc)

### ***Building commands***

Different binaries can be compiled from the 7-Zip source.
There are 2 main files in each folder for compiling:

- makefile        - compiles the Windows version of 7-Zip with the nmake command
- makefile.gcc    - compiles the Linux/macOS versions of 7-Zip with the make command

First change the current folder to one that contains `makefile.gcc`:

```bash
cd CPP/7zip/Bundles/Alone7z
```

Then build with `makefile.gcc`:

```bash
make -j -f makefile.gcc
```

There are also additional "*.mak" files in the folder "CPP/7zip/" that can be used to compile
7-Zip binaries with optimized code and optimizing options.

To compile with GCC without assembler:

```bash
cd CPP/7zip/Bundles/Alone7z
make -j -f ../../cmpl_gcc.mak
```

You can also change some compiler options in the mak files:

- cmpl_gcc.mak
- var_gcc.mak
- warn_gcc.mak

## Interface Usage

This section describes the LZMA encoding and decoding functions written in C.

Note: you can also read the LZMA Specification (lzma-specification.txt from LZMA SDK).

You can also look at the source code for LZMA encoding and decoding:

  ***C/Util/Lzma/LzmaUtil.c***

### ***LZMA compressed file format***

```bash
Offset Size Description
  0     1   Special LZMA properties (lc, lp, pb in encoded form)
  1     4   Dictionary size (little endian)
  5     8   Uncompressed size (little endian). -1 means unknown size
 13         Compressed data
```
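
The layout above can be read with a few lines of plain C. The following is a minimal sketch, not code from the SDK: `LzmaFileHeader` and `parse_lzma_header` are hypothetical names, and the unpacking of the properties byte assumes the encoding `(pb * 5 + lp) * 9 + lc` given in the LZMA specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the fields of the 13-byte header (not an SDK type). */
typedef struct {
  unsigned lc, lp, pb;      /* unpacked from the properties byte at offset 0 */
  uint32_t dict_size;       /* offsets 1..4, little endian */
  uint64_t unpacked_size;   /* offsets 5..12, little endian; UINT64_MAX (-1) = unknown */
} LzmaFileHeader;

enum { LZMA_FILE_HEADER_SIZE = 13 };

/* Returns 0 on success, -1 if the buffer is too short or the
   properties byte is outside the valid range. */
static int parse_lzma_header(const unsigned char *buf, size_t len, LzmaFileHeader *out)
{
  unsigned d;
  int i;

  assert(buf != NULL && out != NULL);
  if (len < LZMA_FILE_HEADER_SIZE)
    return -1;

  d = buf[0];
  if (d >= 9 * 5 * 5)       /* assumed encoding: (pb * 5 + lp) * 9 + lc */
    return -1;
  out->lc = d % 9;
  d /= 9;
  out->lp = d % 5;
  out->pb = d / 5;

  out->dict_size = 0;
  for (i = 0; i < 4; i++)
    out->dict_size |= (uint32_t)buf[1 + i] << (8 * i);

  out->unpacked_size = 0;
  for (i = 0; i < 8; i++)
    out->unpacked_size |= (uint64_t)buf[5 + i] << (8 * i);
  return 0;
}
```

A caller would read the first 13 bytes of a file into a buffer, call `parse_lzma_header`, and treat `unpacked_size == UINT64_MAX` as "size unknown".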

ANSI-C LZMA Decoder

Please note that the interfaces for the ANSI-C code were changed in LZMA SDK 4.58.
If you want to use the old interfaces, you can download a previous version of LZMA SDK
from the sourceforge.net site.

To use the ANSI-C LZMA Decoder you need the following files:

```bash
  LzmaDec.h
  LzmaDec.c
  7zTypes.h
  Precomp.h
  Compiler.h
```

See the example code:
  C/Util/Lzma/LzmaUtil.c

Memory requirements for LZMA decoding

1. Stack usage of the LZMA decoding function for local variables is not larger than 200-400 bytes.
2. The LZMA Decoder uses a dictionary buffer and an internal state structure.
3. The internal state structure consumes state_size = (4 + (1.5 << (lc + lp))) KB. By default (lc=3, lp=0), state_size = 16 KB.
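
As a worked example of the state_size formula, here is a small sketch in C. The function name `lzma_state_size_bytes` is hypothetical, and the sketch assumes "KB" means 1024 bytes, so that 1.5 KB becomes 1536 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* state_size in bytes for (4 + (1.5 << (lc + lp))) KB.
   Assumption: 1 KB = 1024 bytes, so 1.5 KB = 1536 bytes. */
static size_t lzma_state_size_bytes(unsigned lc, unsigned lp)
{
  assert(lc <= 8 && lp <= 4);   /* ranges allowed by the LZMA properties byte */
  return (size_t)4 * 1024 + ((size_t)1536 << (lc + lp));
}
```

With the defaults, `lzma_state_size_bytes(3, 0)` evaluates to 4096 + 1536 * 8 = 16384 bytes, which matches the 16 KB stated above.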

### ***How To decompress data***

The LZMA Decoder (ANSI-C version) supports 2 interfaces:

**1)** Single-call Decompressing

**2)** Multi-call State Decompressing (zlib-like interface)

**You must use an external allocator:**

Example:

```c
void *SzAlloc(void *p, size_t size) { p = p; return malloc(size); }
void SzFree(void *p, void *address) { p = p; free(address); }
ISzAlloc alloc = { SzAlloc, SzFree };
```

The `p = p;` statement is there only to suppress unused-parameter compiler warnings.
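
The two-callback allocator shape can also carry bookkeeping, which is handy when checking for leaks. The sketch below is illustrative only: `DemoAlloc` is a local stand-in with the same shape as the example above, not the SDK's `ISzAlloc`, and `CountingAlloc`, `CountingFree` and `g_live_blocks` are hypothetical names. It uses `(void)p;`, a common alternative to `p = p;` for silencing the warning.

```c
#include <stddef.h>
#include <stdlib.h>

/* Local stand-in with the same two-callback shape as the example above.
   This is NOT the SDK's ISzAlloc type. */
typedef struct {
  void *(*Alloc)(void *p, size_t size);
  void (*Free)(void *p, void *address);
} DemoAlloc;

static size_t g_live_blocks = 0;   /* blocks allocated and not yet freed */

static void *CountingAlloc(void *p, size_t size)
{
  void *mem;
  (void)p;                         /* unused context pointer */
  mem = malloc(size);
  if (mem != NULL)
    g_live_blocks++;
  return mem;
}

static void CountingFree(void *p, void *address)
{
  (void)p;
  if (address != NULL)
    g_live_blocks--;
  free(address);                   /* free(NULL) is a no-op */
}

static const DemoAlloc g_demo_alloc = { CountingAlloc, CountingFree };
```

After all decoder structures have been released, `g_live_blocks` should be back at zero; a non-zero value points to a missing free call.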

#### ***Single-call Decompressing***

1. When to use: RAM->RAM decompressing
2. Compile files: LzmaDec.h + LzmaDec.c + 7zTypes.h
3. Compile defines: no defines
4. Memory Requirements:

- Input buffer: compressed size
- Output buffer: uncompressed size
- LZMA Internal Structures: state_size (16 KB for default settings)

**Interface:**

```c
  int LzmaDecode(Byte *dest, SizeT *destLen, const Byte *src, SizeT *srcLen,
      const Byte *propData, unsigned propSize, ELzmaFinishMode finishMode,
      ELzmaStatus *status, ISzAlloc *alloc);
  In:
    dest     - output data
    destLen  - output data size
    src      - input data
    srcLen   - input data size
    propData - LZMA properties  (5 bytes)
    propSize - size of propData buffer (5 bytes)
    finishMode - It has meaning only if the decoding reaches output limit (*destLen).
         LZMA_FINISH_ANY - Decode just destLen bytes.
         LZMA_FINISH_END - Stream must be finished after (*destLen).
                           You can use LZMA_FINISH_END, when you know that
                           current output buffer covers last bytes of stream.
    alloc    - Memory allocator.

  Out:
    destLen  - processed output size
    srcLen   - processed input size

  Output:
    SZ_OK
      status:
        LZMA_STATUS_FINISHED_WITH_MARK
        LZMA_STATUS_NOT_FINISHED
        LZMA_STATUS_MAYBE_FINISHED_WITHOUT_MARK
    SZ_ERROR_DATA - Data error
    SZ_ERROR_MEM  - Memory allocation error
    SZ_ERROR_UNSUPPORTED - Unsupported properties
    SZ_ERROR_INPUT_EOF - It needs more bytes in input buffer (src).
```

If the LZMA decoder sees end_marker before reaching the output limit, it returns an OK result,
and the output value of destLen will be less than the output buffer size limit.

You can use multiple checks to test data integrity after full decompression:

   1. Check the result and the "status" variable.
   2. Check that output(destLen) = uncompressedSize, if you know the real uncompressedSize.
   3. Check that output(srcLen) = compressedSize, if you know the real compressedSize.
       You must use the correct finish mode in that case.

#### ***Multi-call State Decompressing (zlib-like interface)***

1. When to use: file->file decompressing
2. Compile files: LzmaDec.h + LzmaDec.c + 7zTypes.h
3. Memory Requirements:

- Buffer for input stream: any size (for example, 16 KB)
- Buffer for output stream: any size (for example, 16 KB)
- LZMA Internal Structures: state_size (16 KB for default settings)
- LZMA dictionary (dictionary size is encoded in LZMA properties header)

**1)** Read the LZMA properties (5 bytes) and uncompressed size (8 bytes, little-endian) into header:

```c
   unsigned char header[LZMA_PROPS_SIZE + 8];
   ReadFile(inFile, header, sizeof(header));
```

**2)** Allocate CLzmaDec structures (state + dictionary) using LZMA properties

```c
  CLzmaDec state;
  LzmaDec_Constr(&state);
  res = LzmaDec_Allocate(&state, header, LZMA_PROPS_SIZE, &g_Alloc);
  if (res != SZ_OK)
    return res;
```

**3)** Initialize the LzmaDec structure before any new LZMA stream, then call LzmaDec_DecodeToBuf in a loop

```c
  LzmaDec_Init(&state);
  for (;;)
  {
    ...
    /* Prototype: SRes LzmaDec_DecodeToBuf(CLzmaDec *p, Byte *dest, SizeT *destLen,
         const Byte *src, SizeT *srcLen, ELzmaFinishMode finishMode, ELzmaStatus *status); */
    res = LzmaDec_DecodeToBuf(&state, dest, &destLen,
        src, &srcLen, finishMode, &status);
    ...
  }
```

**4)** Free all allocated structures

```c
  LzmaDec_Free(&state, &g_Alloc);
```

See the example code:
  C/Util/Lzma/LzmaUtil.c
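
The buffer bookkeeping around the decode call in step 3 can be sketched without the SDK. In the hypothetical example below, `demo_decode_to_buf` is a stand-in that simply copies bytes; it only mimics the in/out convention in which `*destLen` and `*srcLen` are capacities on entry and processed counts on return. `demo_stream` and the tiny buffer sizes are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { IN_BUF_SIZE = 8, OUT_BUF_SIZE = 4 };  /* tiny on purpose; the text suggests e.g. 16 KB */

/* Stand-in for the decoder call: copies bytes unchanged.
   On entry *destLen / *srcLen are capacities; on return they hold
   the number of bytes produced / consumed. */
static int demo_decode_to_buf(unsigned char *dest, size_t *destLen,
    const unsigned char *src, size_t *srcLen)
{
  size_t n = (*destLen < *srcLen) ? *destLen : *srcLen;
  memcpy(dest, src, n);
  *destLen = n;
  *srcLen = n;
  return 0;
}

/* Pumps inSize bytes from `in` to `out` through two fixed-size buffers.
   Returns the number of bytes written, or (size_t)-1 on error. */
static size_t demo_stream(const unsigned char *in, size_t inSize,
    unsigned char *out, size_t outCap)
{
  unsigned char inBuf[IN_BUF_SIZE];
  unsigned char outBuf[OUT_BUF_SIZE];
  size_t inPos = 0, inAvail = 0;   /* unread window inside inBuf */
  size_t inTotal = 0, outTotal = 0;

  assert(in != NULL && out != NULL);
  for (;;)
  {
    size_t srcLen, destLen;
    if (inPos == inAvail)          /* input buffer drained: refill it */
    {
      inAvail = inSize - inTotal;
      if (inAvail > IN_BUF_SIZE)
        inAvail = IN_BUF_SIZE;
      memcpy(inBuf, in + inTotal, inAvail);
      inTotal += inAvail;
      inPos = 0;
      if (inAvail == 0)
        return outTotal;           /* the copy stand-in has nothing left to emit */
    }
    srcLen = inAvail - inPos;
    destLen = OUT_BUF_SIZE;
    if (demo_decode_to_buf(outBuf, &destLen, inBuf + inPos, &srcLen) != 0)
      return (size_t)-1;
    inPos += srcLen;               /* advance by what the decoder consumed */
    if (outTotal + destLen > outCap)
      return (size_t)-1;
    memcpy(out + outTotal, outBuf, destLen);   /* flush what the decoder produced */
    outTotal += destLen;
  }
}
```

A real decoder differs from the stand-in in one important way: it can finish before the input is exhausted, so the loop in a real program should also stop based on the returned status and the known uncompressed size, as LzmaUtil.c does.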

### ***How To compress data***

1 Compile files:

```bash
  7zTypes.h
  Threads.h
  LzmaEnc.h
  LzmaEnc.c
  LzFind.h
  LzFind.c
  LzFindMt.h
  LzFindMt.c
  LzHash.h
```

2 Memory Requirements:

- (dictSize * 11.5 + 6 MB) + state_size

3 The LZMA Encoder can use two memory allocators:

- alloc - for small arrays.
- allocBig - for big arrays.

For example, you can use large RAM pages (2 MB) in the allocBig allocator for better compression speed. Note that Windows has a poor implementation of large RAM pages.
It's OK to use the same allocator for alloc and allocBig.
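
To get a feel for the encoder memory requirement listed in item 2, the formula can be evaluated directly. This is a rough sketch under stated assumptions: `lzma_enc_mem_estimate` is a hypothetical helper, "MB" is taken to mean 2^20 bytes, and `dictSize * 11.5` is computed in integers as `dictSize * 23 / 2`.

```c
#include <stdint.h>

/* Estimate in bytes of (dictSize * 11.5 + 6 MB) + state_size.
   Assumption: 1 MB = 2^20 bytes. */
static uint64_t lzma_enc_mem_estimate(uint32_t dict_size, uint64_t state_size)
{
  const uint64_t six_mb = (uint64_t)6 << 20;
  /* dictSize * 11.5, kept in integer arithmetic as dictSize * 23 / 2 */
  return (uint64_t)dict_size * 23 / 2 + six_mb + state_size;
}
```

For a 16 MB dictionary and the default 16 KB state, this gives 192937984 + 6291456 + 16384 = 199245824 bytes, about 190 MB.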

#### ***Single-call Compression with callbacks***

See the example code:
  C/Util/Lzma/LzmaUtil.c

When to use: file->file compressing

**1)** You must implement callback structures for the interfaces:

```c
/* Interfaces to implement:
     ISeqInStream
     ISeqOutStream
     ICompressProgress
     ISzAlloc
*/

static void *SzAlloc(void *p, size_t size) { p = p; return MyAlloc(size); }
static void SzFree(void *p, void *address) {  p = p; MyFree(address); }
static ISzAlloc g_Alloc = { SzAlloc, SzFree };

  CFileSeqInStream inStream;
  CFileSeqOutStream outStream;

  inStream.funcTable.Read = MyRead;
  inStream.file = inFile;
  outStream.funcTable.Write = MyWrite;
  outStream.file = outFile;
```

**2)** Create the CLzmaEncHandle object:

```c
  CLzmaEncHandle enc;

  enc = LzmaEnc_Create(&g_Alloc);
  if (enc == 0)
    return SZ_ERROR_MEM;
```

**3)** Initialize the CLzmaEncProps properties:

```c
  CLzmaEncProps props;
  LzmaEncProps_Init(&props);
```

  Then you can change some properties in that structure.

**4)** Send the LZMA properties to the LZMA Encoder

```c
  res = LzmaEnc_SetProps(enc, &props);
```

**5)** Write the encoded properties to the header

```c
    Byte header[LZMA_PROPS_SIZE + 8];
    size_t headerSize = LZMA_PROPS_SIZE;
    UInt64 fileSize;
    int i;

    res = LzmaEnc_WriteProperties(enc, header, &headerSize);
    fileSize = MyGetFileLength(inFile);
    /* append the uncompressed size as 8 little-endian bytes */
    for (i = 0; i < 8; i++)
      header[headerSize++] = (Byte)(fileSize >> (8 * i));
    MyWriteFileAndCheck(outFile, header, headerSize);
```

**6)** Call the encoding function:

```c
      res = LzmaEnc_Encode(enc, &outStream.funcTable, &inStream.funcTable,
        NULL, &g_Alloc, &g_Alloc);
```

**7)** Destroy the LZMA Encoder object

```c
  LzmaEnc_Destroy(enc, &g_Alloc, &g_Alloc);
```

If a callback function returns an error code, LzmaEnc_Encode also returns that code,
or it can return a code such as SZ_ERROR_READ, SZ_ERROR_WRITE or SZ_ERROR_PROGRESS.
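
Step 5 above produces the 13-byte header described in the file format section. The sketch below builds the same layout by hand so the byte order is easy to check. `build_lzma_header` is a hypothetical helper, not an SDK function. It assumes the properties byte is encoded as `(pb * 5 + lp) * 9 + lc`, as in the LZMA specification, and it does not reproduce any adjustment of the dictionary size that `LzmaEnc_WriteProperties` may apply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fills out[0..12] with: properties byte, dictionary size (4 bytes, little endian),
   uncompressed size (8 bytes, little endian). Returns the number of bytes written. */
static size_t build_lzma_header(unsigned lc, unsigned lp, unsigned pb,
    uint32_t dict_size, uint64_t unpacked_size, unsigned char out[13])
{
  size_t pos = 0;
  int i;

  assert(out != NULL && lc <= 8 && lp <= 4 && pb <= 4);
  out[pos++] = (unsigned char)((pb * 5 + lp) * 9 + lc);   /* assumed encoding */
  for (i = 0; i < 4; i++)
    out[pos++] = (unsigned char)(dict_size >> (8 * i));
  for (i = 0; i < 8; i++)                                 /* same loop shape as step 5 */
    out[pos++] = (unsigned char)(unpacked_size >> (8 * i));
  return pos;
}
```

Passing `UINT64_MAX` as `unpacked_size` writes eight 0xFF bytes, which is the "-1 means unknown size" case from the format table.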

---

#### ***Single-call RAM->RAM Compression***

Single-call RAM->RAM compression is similar to compression with callbacks,
but you provide pointers to buffers instead of pointers to stream callbacks:

```c
SRes LzmaEncode(Byte *dest, SizeT *destLen, const Byte *src, SizeT srcLen,
    const CLzmaEncProps *props, Byte *propsEncoded, SizeT *propsSize, int writeEndMark,
    ICompressProgress *progress, ISzAlloc *alloc, ISzAlloc *allocBig);
Return code:
  SZ_OK               - OK
  SZ_ERROR_MEM        - Memory allocation error
  SZ_ERROR_PARAM      - Incorrect parameter
  SZ_ERROR_OUTPUT_EOF - output buffer overflow
  SZ_ERROR_THREAD     - errors in multithreading functions (only for Mt version)
```

Defines

```bash
_LZMA_SIZE_OPT          - Enable some optimizations in LZMA Decoder to get smaller executable code.
_LZMA_PROB32            - It can increase the speed on some 32-bit CPUs, but memory usage for
                          some structures will be doubled in that case.
_LZMA_UINT32_IS_ULONG   - Define it if int is 16-bit on your compiler and long is 32-bit.
_LZMA_NO_SYSTEM_SIZE_T  - Define it if you don't want to use size_t type.
_7ZIP_PPMD_SUPPPORT     - Define it if you don't want to support PPMD method in ANSI-C .7z decoder.
```

C++ LZMA Encoder/Decoder

The C++ LZMA code uses COM-like interfaces, so if you want to use it, it helps to study the basics of COM/OLE.

The C++ LZMA code is just a wrapper over the ANSI-C code.

C++ Notes

If you use some C++ code folders in 7-Zip (for example, C++ code for .7z handling),
you must check that you work correctly with the "new" operator.

7-Zip can be compiled with MSVC 6.0, which doesn't throw an exception from the "new" operator.
So 7-Zip uses "CPP\Common\NewHandler.cpp", which redefines the "new" operator:

```cpp
void *operator new(size_t size)
{
  void *p = ::malloc(size);
  if (p == 0)
    throw CNewException();
  return p;
}
```

If you use a version of MSVC that throws an exception from the "new" operator, you can compile without
"NewHandler.cpp", so the standard exception will be used. Some code in
7-Zip catches any exception in internal code and converts it to an HRESULT code,
so you don't need to catch CNewException if you call the COM interfaces of 7-Zip.

### ***Interface Examples:***

See the example code: C/Util/Lzma/LzmaUtil.c

```bash
cd C/Util/Lzma
make -j -f makefile.gcc
# output: ./_o/7lzma
```

```bash
    LZMA-C 22.01 (x64) : Igor Pavlov : Public domain : 2022-07-15

    Usage:  lzma <e|d> inputFile outputFile
    e: encode file
    d: decode file
```

## Contribution

[https://sourceforge.net/p/sevenzip/_list/tickets](https://sourceforge.net/p/sevenzip/_list/tickets)

## Repositories Involved

[**developtools\hiperf**](https://gitee.com/openharmony/developtools_hiperf)

README_zh.md

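# third_party_lzma

## Introduction

LZMA is an improved version of the well-known LZ77 compression algorithm. It maximizes the compression ratio while keeping decompression fast and decompression memory requirements low.

LZMA2 is based on LZMA. It provides better multithreading support during compression, along with other improvements.

7z is a file format for data compression and file archiving, and is the main file format of the 7-Zip program ([**7z official site**](https://www.7-zip.org)).
The 7z format supports different compression methods: LZMA, LZMA2 and others. It also supports AES-256 based symmetric encryption.

XZ is a file format that uses LZMA2 data compression. The XZ format has additional features: SHA/CRC data checks, filters for improved compression ratio, and splitting into blocks and streams.

## Software Architecture

Software architecture description

| format/algorithm  | C | C++ | C# | Java |
| :------ | :---------| :----- | :----- | :----- |
| LZMA compression and decompression                |  ✓         | ✓      |  ✓    |  ✓    |
| LZMA2 compression and decompression               |  ✓         | ✓      |       |        |
| XZ compression and decompression                  |  ✓         | ✓      |       |        |
| 7z decompression                                  | ✓          | ✓      |       |        |
| 7z compression                                    |            | ✓      |       |        |
| small SFXs for installers (7z decompression)      |  ✓         |         |       |        |
| SFXs and SFXs for installers (7z decompression)   |             | ✓      |       |        |

---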

```bash
/third_party/lzma
├── Asm                             # asm files (optimized code for CRC calculation and Intel-AES encryption)
│   ├── arm
│   ├── arm64
│   └── x86
├── C                               # C files (compression / decompression and other)
│   └── Util
│       ├── 7z                      # 7z decoder program (decoding 7z files)
│       ├── Lzma                    # LZMA program (file->file LZMA encoder/decoder)
│       ├── LzmaLib                 # LZMA library (.DLL for Windows)
│       └── SfxSetup                # small SFX module for installers
├── CPP
│   ├── Common                      # common files for C++ projects
│   ├── Windows                     # common files for Windows related code
│   └── 7zip                        # files related to 7-Zip
│       ├── Archive                 # files related to archiving
│       │   ├── Common              # common files for archive handling
│       │   └── 7z                  # 7z C++ Encoder/Decoder
│       ├── Bundles                 # Modules that are bundles of other modules (files)
│       │   ├── Alone7z             # 7zr.exe: Standalone 7-Zip console program (reduced version)
│       │   ├── Format7zExtractR    # 7zxr.dll: Reduced version of 7z DLL: extracting from 7z/LZMA/BCJ/BCJ2.
│       │   ├── Format7zR           # 7zr.dll:  Reduced version of 7z DLL: extracting/compressing to 7z/LZMA/BCJ/BCJ2
│       │   ├── LzmaCon             # lzma.exe: LZMA compression/decompression
│       │   ├── LzmaSpec            # example code for LZMA Specification
│       │   ├── SFXCon              # 7zCon.sfx: Console 7z SFX module
│       │   ├── SFXSetup            # 7zS.sfx: 7z SFX module for installers
│       │   └── SFXWin              # 7z.sfx: GUI 7z SFX module
│       ├── Common                  # common files for 7-Zip
│       ├── Compress                # files for compression/decompression
│       ├── Crypto                  # files for encryption / decryption
│       └── UI                      # User Interface files
│           ├── Client7z            # Test application for 7za.dll, 7zr.dll, 7zxr.dll
│           ├── Common              # Common UI files
│           ├── Console             # Code for console program (7z.exe)
│           ├── Explorer            # Some code from 7-Zip Shell extension
│           ├── FileManager         # Some GUI code from 7-Zip File Manager
│           └── GUI                 # Some GUI code from 7-Zip
├── CS
│   └── 7zip
│       ├── Common                  # some common files for 7-Zip
│       └── Compress                # files related to compression/decompression
│           ├── LZ                  # files related to LZ (Lempel-Ziv) compression algorithm
│           ├── LZMA                # LZMA compression/decompression
│           ├── LzmaAlone           # file->file LZMA compression/decompression
│           └── RangeCoder          # Range Coder (special code of compression/decompression)
├── DOC
│   ├── 7zC.txt                     # 7z ANSI-C Decoder description
│   ├── 7zFormat.txt                # 7z Format description
│   ├── installer.txt               # information about 7-Zip for installers
│   ├── lzma-history.txt            # history of LZMA SDK
│   ├── lzma-sdk.txt                # LZMA SDK description
│   ├── lzma-specification.txt      # Specification of LZMA
│   ├── lzma.txt                    # LZMA compression description
│   └── Methods.txt                 # Compression method IDs for .7z
└── Java
    └── SevenZip
        └── Compression             # files related to compression/decompression
            ├── LZ                  # files related to LZ (Lempel-Ziv) compression algorithm
            ├── LZMA                # LZMA compression/decompression
            └── RangeCoder          # Range Coder (special code of compression/decompression)
```

## License

LZMA SDK is written and placed in the public domain by Igor Pavlov.

Some code in LZMA SDK is based on public domain code from other developers:

  1) PPMd var.H (2001): Dmitry Shkarin

  2) SHA-256: Wei Dai (Crypto++ library)

Anyone is free to copy, modify, publish, use, compile, sell, or distribute the
original LZMA SDK code, either in source code form or as a compiled binary, for any purpose, commercial or non-commercial, and by any means.

LZMA SDK code is compatible with open source licenses; for example, you can include it in GNU GPL or GNU LGPL code.

## Build

### ***UNIX/Linux***

There are several options for compiling 7-Zip with gcc and clang. Some critical parts of the 7-Zip code exist in two versions: C and assembler. Compiling the version with assembler code gives a faster 7-Zip binary. 7-Zip's assembler code follows the syntax of each platform.

#### *arm64*

The arm64 versions of gcc and clang support the arm64 assembler code syntax.

#### *x86 and x86_64(AMD64)*

Both Asmc Macro Assembler and JWasm support MASM syntax on Linux, but JWasm does not support some CPU instructions used in 7-Zip.
If you want to compile a faster 7-Zip, you must install Asmc Macro Assembler on Linux: [https://github.com/nidud/asmc](https://github.com/nidud/asmc)

### ***Building commands***

There are two main files in the folder for compiling:

- makefile        - compiles the Windows version of 7-Zip with the nmake command
- makefile.gcc    - compiles the Linux/macOS versions of 7-Zip with the make command

First change to the folder that contains `makefile.gcc`:

```bash
cd CPP/7zip/Bundles/Alone7z
```

```bash
make -j -f makefile.gcc
```

The "*.mak" files in the "CPP/7zip/" folder can also be used to compile with optimized code and optimizing options. For example:

```bash
cd CPP/7zip/Bundles/Alone7z
make -j -f ../../cmpl_gcc.mak
```

## **Interface Usage**

This section describes the LZMA encoding and decoding functions implemented in C.

Note: you can also read the LZMA Specification (lzma-specification.txt from LZMA SDK).

You can also look at the example that uses LZMA encoding and decoding:
  ***C/Util/Lzma/LzmaUtil.c***

### ***LZMA compressed file format***

```bash
Offset Size Description
  0     1   Special LZMA properties (lc, lp, pb in encoded form)
  1     4   Dictionary size (little endian)
  5     8   Uncompressed size (little endian). -1 means unknown size
 13         Compressed data
```

ANSI-C (American National Standards Institute) LZMA Decoder

Please note that the ANSI-C interfaces were changed in LZMA SDK 4.58. If you want to use the old interfaces, you can download a previous version of LZMA SDK from the sourceforge.net site.

To use the ANSI-C LZMA Decoder you need the following files:

```bash
  LzmaDec.h
  LzmaDec.c
  7zTypes.h
  Precomp.h
  Compiler.h
```

Example code: C/Util/Lzma/LzmaUtil.c

Memory requirements for LZMA decoding

1. Stack usage of the LZMA decoding function for local variables is not larger than 200-400 bytes.

2. The LZMA Decoder uses a dictionary buffer and an internal state structure.

3. The internal state structure consumes state_size = (4 + (1.5 << (lc + lp))) KB. By default (lc=3, lp=0), state_size = 16 KB.

### ***How to decompress data***

The LZMA Decoder (ANSI-C version) supports the following two interfaces:

**1)** Single call: LzmaDecode

**2)** Multiple calls: LzmaDec_DecodeToBuf (zlib-like interface)

**You must define your own memory allocator:**

Example:

```c
void *SzAlloc(void *p, size_t size) { p = p; return malloc(size); }
void SzFree(void *p, void *address) { p = p; free(address); }
ISzAlloc alloc = { SzAlloc, SzFree };
```

The `p = p;` statement is there only to suppress unused-parameter compiler warnings.

#### ***Single-call decompressing***

1. When to use: RAM->RAM decompressing
2. Compile files: LzmaDec.h + LzmaDec.c + 7zTypes.h
3. Compile defines: none required
4. Memory requirements:

- Input buffer: compressed size
- Output buffer: uncompressed size
- LZMA Internal Structures: state_size (16 KB for default settings)

**Interface:**

```c
  int LzmaDecode(Byte *dest, SizeT *destLen, const Byte *src, SizeT *srcLen,
      const Byte *propData, unsigned propSize, ELzmaFinishMode finishMode,
      ELzmaStatus *status, ISzAlloc *alloc);
  In:
    dest     - output data
    destLen  - output data size
    src      - input data
    srcLen   - input data size
    propData - LZMA properties  (5 bytes)
    propSize - size of propData buffer (5 bytes)
    finishMode - It has meaning only if the decoding reaches output limit (*destLen).
         LZMA_FINISH_ANY - Decode just destLen bytes.
         LZMA_FINISH_END - Stream must be finished after (*destLen).
                           You can use LZMA_FINISH_END, when you know that
                           current output buffer covers last bytes of stream.
    alloc    - Memory allocator.

  Out:
    destLen  - processed output size
    srcLen   - processed input size

  Output:
    SZ_OK
      status:
        LZMA_STATUS_FINISHED_WITH_MARK
        LZMA_STATUS_NOT_FINISHED
        LZMA_STATUS_MAYBE_FINISHED_WITHOUT_MARK
    SZ_ERROR_DATA - Data error
    SZ_ERROR_MEM  - Memory allocation error
    SZ_ERROR_UNSUPPORTED - Unsupported properties
    SZ_ERROR_INPUT_EOF - It needs more bytes in input buffer (src).
```

If the LZMA decoder sees end_marker before reaching the output buffer limit, it returns OK, and the output value of destLen will be less than the output buffer limit.

You can use multiple checks to test data integrity after full decompression:

   1. Check the return value and the status variable.
   2. If you know the uncompressed data size, check that output(destLen) = uncompressedSize.
   3. If you know the compressed data size, check that output(srcLen) = compressedSize.

#### ***Multi-call state decompressing (zlib-like interface)***

1. When to use: file->file decompressing
2. Compile files: LzmaDec.h + LzmaDec.c + 7zTypes.h
3. Memory requirements:

- Buffer for input stream: any size (for example, 16 KB)
- Buffer for output stream: any size (for example, 16 KB)
- LZMA Internal Structures: state_size (16 KB for default settings)
- LZMA dictionary (the dictionary size is encoded in the LZMA properties header)

Usage steps:

**1)** Read the LZMA properties (5 bytes) and uncompressed size (8 bytes, little-endian) into header:

```c
   unsigned char header[LZMA_PROPS_SIZE + 8];
   ReadFile(inFile, header, sizeof(header));
```

**2)** Allocate the CLzmaDec structures (state + dictionary) using the LZMA properties

```c
  CLzmaDec state;
  LzmaDec_Constr(&state);
  res = LzmaDec_Allocate(&state, header, LZMA_PROPS_SIZE, &g_Alloc);
  if (res != SZ_OK)
    return res;
```

**3)** Initialize LzmaDec, then call LzmaDec_DecodeToBuf in a loop

```c
  LzmaDec_Init(&state);
  for (;;)
  {
    ...
    /* Prototype: SRes LzmaDec_DecodeToBuf(CLzmaDec *p, Byte *dest, SizeT *destLen,
         const Byte *src, SizeT *srcLen, ELzmaFinishMode finishMode, ELzmaStatus *status); */
    res = LzmaDec_DecodeToBuf(&state, dest, &destLen,
        src, &srcLen, finishMode, &status);
    ...
  }
```

**4)** Free all allocated structures

```c
  LzmaDec_Free(&state, &g_Alloc);
```

See the example code:
  C/Util/Lzma/LzmaUtil.c

### ***How to compress data***

1 Compile files:

```bash
  7zTypes.h
  Threads.h
  LzmaEnc.h
  LzmaEnc.c
  LzFind.h
  LzFind.c
  LzFindMt.h
  LzFindMt.c
  LzHash.h
```

2 Memory requirements:

- (dictSize * 11.5 + 6 MB) + state_size
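
As a worked example of the formula above, the following sketch computes the estimate in integer arithmetic. `demo_encoder_mem_estimate` is a hypothetical helper, not part of the SDK, and it assumes that "6 MB" means 6 * 2^20 bytes. The value of state_size is passed in by the caller because the formula does not fix it.

```c
#include <stdint.h>

/* Rough encoder memory estimate: dictSize * 11.5 + 6 MB + state_size.
   Assumption: 1 MB is taken as 2^20 bytes. */
static uint64_t demo_encoder_mem_estimate(uint64_t dictSize, uint64_t stateSize)
{
  const uint64_t mib = (uint64_t)1 << 20;
  /* dictSize * 11.5 is computed as dictSize * 23 / 2 to avoid floating point */
  return dictSize * 23 / 2 + 6 * mib + stateSize;
}
```

With a 16 MiB dictionary, the dictionary-dependent part is 184 MiB, so the estimate is 190 MiB plus state_size.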
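
The LZMA encoder can use two memory allocators:

- alloc - for small arrays.
- allocBig - for big arrays.

For example, you can use large RAM pages (2 MB) in the allocBig allocator to get better compression speed. Note that Windows has a poor implementation of large RAM pages. It is also possible to use the same allocator for alloc and allocBig.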

#### ***Single-call compression with callbacks***

See the example code in:
  C/Util/Lzma/LzmaUtil.c

When to use: file->file compressing

**1)** You must implement the callback functions for these interfaces

```c
/* Interfaces to implement:
     ISeqInStream
     ISeqOutStream
     ICompressProgress
     ISzAlloc */

/* Allocator callbacks: the p argument is unused */
static void *SzAlloc(void *p, size_t size) { (void)p; return MyAlloc(size); }
static void SzFree(void *p, void *address) { (void)p; MyFree(address); }
static ISzAlloc g_Alloc = { SzAlloc, SzFree };

  CFileSeqInStream inStream;
  CFileSeqOutStream outStream;

  /* Bind the read/write callbacks and the underlying files */
  inStream.funcTable.Read = MyRead;
  inStream.file = inFile;
  outStream.funcTable.Write = MyWrite;
  outStream.file = outFile;
```

**2)** Create the CLzmaEncHandle object

```c
  CLzmaEncHandle enc;

  enc = LzmaEnc_Create(&g_Alloc);
  if (enc == 0)
    return SZ_ERROR_MEM;
```

**3)** Initialize the CLzmaEncProps properties

```c
  CLzmaEncProps props;
  LzmaEncProps_Init(&props);
```

After that, you can change some of the properties in this structure.

**4)** Pass the properties set in the previous step to the LZMA encoder

```c
  res = LzmaEnc_SetProps(enc, &props);
```

**5)** Write the encoded properties to the header

```c
    Byte header[LZMA_PROPS_SIZE + 8];
    size_t headerSize = LZMA_PROPS_SIZE;
    UInt64 fileSize;
    int i;

    res = LzmaEnc_WriteProperties(enc, header, &headerSize);
    fileSize = MyGetFileLength(inFile);
    /* Append the uncompressed size as 8 little-endian bytes */
    for (i = 0; i < 8; i++)
      header[headerSize++] = (Byte)(fileSize >> (8 * i));
    MyWriteFileAndCheck(outFile, header, headerSize);
```

**6)** Call the encoding function

```c
      res = LzmaEnc_Encode(enc, &outStream.funcTable, &inStream.funcTable,
        NULL, &g_Alloc, &g_Alloc);
```

**7)** Destroy the LZMA encoder object

```c
  LzmaEnc_Destroy(enc, &g_Alloc, &g_Alloc);
```

If a callback function returns an error code, LzmaEnc_Encode also returns that code, or it may return a code such as SZ_ERROR_READ, SZ_ERROR_WRITE or SZ_ERROR_PROGRESS.

---

#### ***Single-call RAM->RAM compression***

Single-call RAM->RAM compression is similar to compression with callbacks, but you provide pointers to buffers instead of pointers to stream callbacks.

```c
SRes LzmaEncode(Byte *dest, SizeT *destLen, const Byte *src, SizeT srcLen,
    const CLzmaEncProps *props, Byte *propsEncoded, SizeT *propsSize, int writeEndMark,
    ICompressProgress *progress, ISzAlloc *alloc, ISzAlloc *allocBig);
/*
Return code:
  SZ_OK               - OK
  SZ_ERROR_MEM        - Memory allocation error
  SZ_ERROR_PARAM      - Incorrect parameter
  SZ_ERROR_OUTPUT_EOF - output buffer overflow
  SZ_ERROR_THREAD     - errors in multithreading functions (only for Mt version)
*/
```

```c
_LZMA_SIZE_OPT          - Enable some optimizations in LZMA Decoder to get smaller executable code.
_LZMA_PROB32            - It can increase the speed on some 32-bit CPUs, but memory usage for
                          some structures will be doubled in that case.
_LZMA_UINT32_IS_ULONG   - Define it if int is 16-bit on your compiler and long is 32-bit.
_LZMA_NO_SYSTEM_SIZE_T  - Define it if you don't want to use size_t type.
_7ZIP_PPMD_SUPPPORT     - Define it if you don't want to support PPMD method in ANSI-C .7z decoder.
```
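
C++ LZMA Encoder/Decoder

The C++ LZMA code uses COM-like interfaces. If you want to use it, you can study the basics of COM (Component Object Model) / OLE (Object Linking and Embedding) / DDE (Dynamic Data Exchange).

Part of the C++ LZMA code is just a wrapper around the ANSI-C code.

Note:
If you use the C++ code from the 7zip directory, you must check that you use the new operator correctly.
7-Zip can be compiled with MSVC 6.0, where operator new does not throw an exception on failure. So 7-Zip redefines operator new in CPP\Common\NewHandler.cpp: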

```cpp
void *operator new(size_t size)
{
  void *p = ::malloc(size);
  if (p == 0)
    throw CNewException();
  return p;
}
```
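
If your MSVC version supports throwing exceptions from operator new, you can omit "NewHandler.cpp" when compiling 7-Zip, so that the standard exception is used.
In practice, parts of the 7-Zip code catch any exception and convert it to an HRESULT code. So if you call the COM interfaces of 7-Zip, you do not need to catch CNewException.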

### ***Interface example:***

See the example code: C/Util/Lzma/LzmaUtil.c

```bash
    cd C/Util/Lzma
    make -j -f makefile.gcc
    # output: ./_o/7lzma
```

```bash
    LZMA-C 22.01 (x64) : Igor Pavlov : Public domain : 2022-07-15

    Usage:  lzma <e|d> inputFile outputFile
    e: encode file
    d: decode file
```

## Contribution

[https://sourceforge.net/p/sevenzip/_list/tickets](https://sourceforge.net/p/sevenzip/_list/tickets)

## Related repositories

[**developtools\hiperf**](https://gitee.com/openharmony/developtools_hiperf)
501