| Name | | Date | Size | #Lines | LOC |
| :--- | :--- | :--- | :--- | :--- | :--- |
| .. | | - | - | | |
| Asm/ | | 07-Sep-2024 | - | 5,319 | 4,128 |
| C/ | | 07-Sep-2024 | - | 38,609 | 27,234 |
| CPP/ | | 07-Sep-2024 | - | 116,429 | 84,072 |
| CS/7zip/ | | 07-Sep-2024 | - | 4,586 | 3,863 |
| DOC/ | | 07-Sep-2024 | - | 3,491 | 2,548 |
| Java/SevenZip/ | | 07-Sep-2024 | - | 3,541 | 3,077 |
| BUILD.gn | D | 07-Sep-2024 | 3.7 KiB | 167 | 144 |
| LICENSE | D | 07-Sep-2024 | 464 | 11 | 8 |
| NOTICE | D | 07-Sep-2024 | 464 | 11 | 8 |
| OAT.xml | D | 07-Sep-2024 | 32.3 KiB | 520 | 497 |
| README.OpenSource | D | 07-Sep-2024 | 309 | 12 | 11 |
| README.md | D | 07-Sep-2024 | 18.6 KiB | 548 | 412 |
| README_zh.md | D | 07-Sep-2024 | 17.3 KiB | 501 | 371 |
| bundle.json | D | 07-Sep-2024 | 1.2 KiB | 47 | 47 |
| lzma.gni | D | 07-Sep-2024 | 1.3 KiB | 61 | 59 |
README.OpenSource
[
    {
        "Name": "lzma",
        "License": "Public domain",
        "License File": "LICENSE",
        "Version Number": "23.01",
        "Owner": "zangleizhen@huawei.com",
        "Upstream URL": "https://7-zip.org/a/lzma2301.7z",
        "Description": "LZMA is default and general compression method of 7z and xz format."
    }
]

README.md
# third_party_lzma

## Description

---
LZMA SDK provides the documentation, samples, header files,
libraries, and tools you need to develop applications that
use 7z / LZMA / LZMA2 / XZ compression.

LZMA is an improved version of the well-known LZ77 compression algorithm.
It was designed to maximize the compression ratio while keeping
decompression fast and the memory required for decompression low.

LZMA2 is an LZMA-based compression method. It provides better
multithreading support for compression than LZMA, along with some other improvements.

7z is a file format for data compression and file archiving.
It is the main file format of the 7-Zip compression program (www.7-zip.org).
The 7z format supports several compression methods: LZMA, LZMA2, and others.
7z also supports AES-256 based encryption.

XZ is a file format for data compression that uses LZMA2 compression.
The XZ format provides additional features: SHA/CRC checks, filters for
improved compression ratio, and splitting into blocks and streams.

---

## Software Architecture

---
Source code:

| format/algorithm | C | C++ | C# | Java |
| :------ | :---------| :----- | :----- | :----- |
| LZMA compression and decompression | ✓ | ✓ | ✓ | ✓ |
| LZMA2 compression and decompression | ✓ | ✓ | | |
| XZ compression and decompression | ✓ | ✓ | | |
| 7z decompression | ✓ | ✓ | | |
| 7z compression | | ✓ | | |
| small SFXs for installers (7z decompression) | ✓ | | | |
| SFXs and SFXs for installers (7z decompression) | | ✓ | | |

---
Source code structure
45
46```bash
47/third_party/lzma
48├── Asm # asm files (optimized code for CRC calculation and Intel-AES encryption)
49│ ├── arm
50│ ├── arm64
51│ └── x86
52├── C # C files (compression / decompression and other)
53│ └── Util
54│ ├── 7z # 7z decoder program (decoding 7z files)
55│ ├── Lzma # LZMA program (file->file LZMA encoder/decoder)
56│ ├── LzmaLib # LZMA library (.DLL for Windows)
57│ └── SfxSetup # small SFX module for installers
58├── CPP
59│ ├── Common # common files for C++ projects
60│ ├── Windows # common files for Windows related code
61│ └── 7zip # files related to 7-Zip
62│ ├── Archive # files related to archiving
63│ │ ├── Common # common files for archive handling
64│ │ └── 7z # 7z C++ Encoder/Decoder
65│ ├── Bundles # Modules that are bundles of other modules (files)
66│ │ ├── Alone7z # 7zr.exe: Standalone 7-Zip console program (reduced version)
67│ │ ├── Format7zExtractR # 7zxr.dll: Reduced version of 7z DLL: extracting from 7z/LZMA/BCJ/BCJ2.
68│ │ ├── Format7zR # 7zr.dll: Reduced version of 7z DLL: extracting/compressing to 7z/LZMA/BCJ/BCJ2
69│ │ ├── LzmaCon # lzma.exe: LZMA compression/decompression
70│ │ ├── LzmaSpec # example code for LZMA Specification
71│ │ ├── SFXCon # 7zCon.sfx: Console 7z SFX module
72│ │ ├── SFXSetup # 7zS.sfx: 7z SFX module for installers
73│ │ └── SFXWin # 7z.sfx: GUI 7z SFX module
74│ ├── Common # common files for 7-Zip
75│ ├── Compress # files for compression/decompression
76│ ├── Crypto # files for encryption / decompression
77│ └── UI # User Interface files
78│ ├── Client7z # Test application for 7za.dll, 7zr.dll, 7zxr.dll
79│ ├── Common # Common UI files
80│ ├── Console # Code for console program (7z.exe)
81│ ├── Explorer # Some code from 7-Zip Shell extension
82│ ├── FileManager # Some GUI code from 7-Zip File Manager
83│ └── GUI # Some GUI code from 7-Zip
84├── CS
85│ └── 7zip
86│ ├── Common # some common files for 7-Zip
87│ └── Compress # files related to compression/decompression
88│ ├── LZ # files related to LZ (Lempel-Ziv) compression algorithm
89│ ├── LZMA # LZMA compression/decompression
90│ ├── LzmaAlone # file->file LZMA compression/decompression
91│ └── RangeCoder # Range Coder (special code of compression/decompression)
92├── DOC
93│ ├── 7zC.txt # 7z ANSI-C Decoder description
94│ ├── 7zFormat.txt # 7z Format description
95│ ├── installer.txt # information about 7-Zip for installers
96│ ├── lzma-history.txt # history of LZMA SDK
97│ ├── lzma-sdk.txt # LZMA SDK description
98│ ├── lzma-specification.txt # Specification of LZMA
99│ ├── lzma.txt # LZMA compression description
100│ └── Methods.txt # Compression method IDs for .7z
101└── Java
102 └── SevenZip
103 └── Compression # files related to compression/decompression
104 ├── LZ # files related to LZ (Lempel-Ziv) compression algorithm
105 ├── LZMA # LZMA compression/decompression
106 └── RangeCoder # Range Coder (special code of compression/decompression)
107```
108
109---
110
## NOTICES / LICENSE

LZMA SDK is written and placed in the public domain by Igor Pavlov.

Some code in LZMA SDK is based on public domain code from other developers:

  1) PPMd var.H (2001): Dmitry Shkarin
  2) SHA-256: Wei Dai (Crypto++ library)

Anyone is free to copy, modify, publish, use, compile, sell, or distribute the
original LZMA SDK code, either in source code form or as a compiled binary, for
any purpose, commercial or non-commercial, and by any means.

LZMA SDK code is compatible with open source licenses; for example, you can
include it in GNU GPL or GNU LGPL code.
126
## Build

### ***UNIX/Linux version***

7-Zip can be compiled with different compilers: gcc and clang.
The 7-Zip code also contains two versions of some critical parts: one in C and one in assembler.
Compiling the version with assembler code produces a faster 7-Zip binary.

7-Zip's assembler code uses the following syntax on each platform:

#### *arm64: GNU assembler for ARM64 with preprocessor*

The syntax of the arm64 assembler code in 7-Zip is supported by GCC and Clang for ARM64.

#### *x86 and x86_64 (AMD64)*

Two programs support MASM syntax on Linux: Asmc Macro Assembler and JWasm.
JWasm currently does not support some CPU instructions used in 7-Zip,
so you must install Asmc Macro Assembler on Linux if you want to compile the fastest version of 7-Zip for x86 and x86-64: [https://github.com/nidud/asmc](https://github.com/nidud/asmc)
146
### ***Building commands***

Several different binaries can be compiled from the 7-Zip source.
The folder contains two main files for compiling:

- `makefile` - used to compile the Windows version of 7-Zip with the `nmake` command
- `makefile.gcc` - used to compile the Linux/macOS versions of 7-Zip with the `make` command

First, change the current folder to one that contains `makefile.gcc`:

```bash
  cd CPP/7zip/Bundles/Alone7z
```

Then build with `makefile.gcc`:

```bash
  make -j -f makefile.gcc
```

There are also additional "*.mak" files in the "CPP/7zip/" folder that can be used to compile
7-Zip binaries with optimized code and optimizing options.

To compile with GCC without assembler:

```bash
  cd CPP/7zip/Bundles/Alone7z
  make -j -f ../../cmpl_gcc.mak
```

You can also change some compiler options in these mak files:

- cmpl_gcc.mak
- var_gcc.mak
- warn_gcc.mak
180
## Interface Usage

This section describes the LZMA encoding and decoding functions written in C.

Note: you can also read the LZMA Specification (lzma-specification.txt in the LZMA SDK).

You can also look at the source code for LZMA encoding and decoding:

 ***C/Util/Lzma/LzmaUtil.c***

### ***LZMA compressed file format***

```bash
Offset Size Description
  0     1   Special LZMA properties (lc, lp, pb in encoded form)
  1     4   Dictionary size (little endian)
  5     8   Uncompressed size (little endian). -1 means unknown size
 13         Compressed data
```
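
The header layout above can be read with a few lines of plain C. The following is a minimal sketch, not SDK code: `LzmaHeaderInfo` and `parse_lzma_header` are hypothetical names chosen for illustration, and the properties-byte decoding (`byte = (pb * 5 + lp) * 9 + lc`) follows the LZMA specification rather than anything stated in this README.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the fields of the 13-byte .lzma header. */
typedef struct {
    unsigned lc;                /* literal context bits */
    unsigned lp;                /* literal position bits */
    unsigned pb;                /* position bits */
    uint32_t dict_size;         /* dictionary size in bytes */
    uint64_t uncompressed_size; /* UINT64_MAX (i.e. -1) means unknown */
} LzmaHeaderInfo;

/* Returns 0 on success, -1 if the buffer is shorter than 13 bytes
   or the properties byte is out of range. */
static int parse_lzma_header(const unsigned char *buf, size_t len,
                             LzmaHeaderInfo *out)
{
    unsigned d;
    int i;

    assert(buf != NULL && out != NULL);
    if (len < 13)
        return -1;

    /* Offset 0: lc, lp, pb packed as (pb * 5 + lp) * 9 + lc. */
    d = buf[0];
    if (d >= 9 * 5 * 5)
        return -1;
    out->lc = d % 9;
    d /= 9;
    out->lp = d % 5;
    out->pb = d / 5;

    /* Offset 1: dictionary size, 4 bytes, little endian. */
    out->dict_size = 0;
    for (i = 0; i < 4; i++)
        out->dict_size |= (uint32_t)buf[1 + i] << (8 * i);

    /* Offset 5: uncompressed size, 8 bytes, little endian. */
    out->uncompressed_size = 0;
    for (i = 0; i < 8; i++)
        out->uncompressed_size |= (uint64_t)buf[5 + i] << (8 * i);

    return 0;
}
```

With the default settings (lc=3, lp=0, pb=2) the properties byte is 0x5D. A caller would typically treat `uncompressed_size == UINT64_MAX` as "size unknown" and rely on the end marker instead.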
200
ANSI-C LZMA Decoder

Please note that the interfaces for the ANSI-C code were changed in LZMA SDK 4.58.
If you want to use the old interfaces, you can download a previous version of the LZMA SDK
from the sourceforge.net site.

To use the ANSI-C LZMA Decoder you need the following files:

```bash
  LzmaDec.h
  LzmaDec.c
  7zTypes.h
  Precomp.h
  Compiler.h
```

For example code, see:
  C/Util/Lzma/LzmaUtil.c

Memory requirements for LZMA decoding

1. Stack usage of the LZMA decoding function for local variables is no larger than 200-400 bytes.
2. The LZMA decoder uses a dictionary buffer and an internal state structure.
3. The internal state structure consumes state_size = (4 + (1.5 << (lc + lp))) KB. With the default settings (lc=3, lp=0), state_size = 16 KB.
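
The state_size formula can be checked with integer arithmetic. This is a small sketch of that formula only, assuming 1 KB = 1024 bytes; `lzma_state_size_bytes` is a hypothetical helper, and the figure it returns is the estimate given above, not necessarily the exact number of bytes a particular SDK version allocates.

```c
#include <assert.h>
#include <stddef.h>

/* state_size in bytes = (4 + 1.5 * 2^(lc + lp)) KB, with 1 KB = 1024 bytes.
   1.5 KB is written as 1536 bytes so the shift stays in integers. */
static size_t lzma_state_size_bytes(unsigned lc, unsigned lp)
{
    /* Valid LZMA ranges: lc in 0..8, lp in 0..4. */
    assert(lc <= 8 && lp <= 4);
    return (size_t)4 * 1024 + ((size_t)1536 << (lc + lp));
}
```

For the defaults, `lzma_state_size_bytes(3, 0)` gives 4096 + 12288 = 16384 bytes, which matches the 16 KB quoted above.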
225
### ***How To decompress data***

The LZMA Decoder (ANSI-C version) supports 2 interfaces:

**1)** Single-call Decompressing

**2)** Multi-call State Decompressing (zlib-like interface)

**You must use an external allocator:**

Example:

```c
#include <stdlib.h>   /* malloc, free */
#include "7zTypes.h"  /* ISzAlloc */

void *SzAlloc(void *p, size_t size) { p = p; return malloc(size); }
void SzFree(void *p, void *address) { p = p; free(address); }
ISzAlloc alloc = { SzAlloc, SzFree };
```

The `p = p;` statement is there only to suppress unused-parameter compiler warnings.
245
#### ***Single-call Decompressing***

1. When to use: RAM->RAM decompressing
2. Compile files: LzmaDec.h + LzmaDec.c + 7zTypes.h
3. Compile defines: no defines
4. Memory Requirements:

- Input buffer: compressed size
- Output buffer: uncompressed size
- LZMA Internal Structures: state_size (16 KB for default settings)

**Interface:**

```c
  int LzmaDecode(Byte *dest, SizeT *destLen, const Byte *src, SizeT *srcLen,
      const Byte *propData, unsigned propSize, ELzmaFinishMode finishMode,
      ELzmaStatus *status, ISzAlloc *alloc);
  /*
  In:
    dest     - output data
    destLen  - output data size
    src      - input data
    srcLen   - input data size
    propData - LZMA properties (5 bytes)
    propSize - size of propData buffer (5 bytes)
    finishMode - It has meaning only if the decoding reaches output limit (*destLen).
         LZMA_FINISH_ANY - Decode just destLen bytes.
         LZMA_FINISH_END - Stream must be finished after (*destLen).
                           You can use LZMA_FINISH_END, when you know that
                           current output buffer covers last bytes of stream.
    alloc    - Memory allocator.

  Out:
    destLen  - processed output size
    srcLen   - processed input size

  Output:
    SZ_OK
      status:
        LZMA_STATUS_FINISHED_WITH_MARK
        LZMA_STATUS_NOT_FINISHED
        LZMA_STATUS_MAYBE_FINISHED_WITHOUT_MARK
    SZ_ERROR_DATA - Data error
    SZ_ERROR_MEM  - Memory allocation error
    SZ_ERROR_UNSUPPORTED - Unsupported properties
    SZ_ERROR_INPUT_EOF - It needs more bytes in input buffer (src).
  */
```

If the LZMA decoder sees the end marker before reaching the output limit, it returns an OK result,
and the output value of destLen will be less than the output buffer size limit.

You can use several checks to test data integrity after full decompression:

  1. Check the result and the "status" variable.
  2. Check that output(destLen) = uncompressedSize, if you know the real uncompressedSize.
  3. Check that output(srcLen) = compressedSize, if you know the real compressedSize.
     You must use the correct finish mode in that case.
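
The three checks above can be combined into one helper. The sketch below is illustrative only: the `DEMO_` names are hypothetical stand-ins so the snippet compiles without the SDK, and real code would use `SZ_OK` and the `ELzmaStatus` values from the SDK headers. Treating "not finished" as a failure is an assumption about what a caller wants after a full decompression.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for SZ_OK and ELzmaStatus (illustration only). */
enum { DEMO_OK = 0 };
typedef enum {
    DEMO_STATUS_FINISHED_WITH_MARK,
    DEMO_STATUS_NOT_FINISHED,
    DEMO_STATUS_MAYBE_FINISHED_WITHOUT_MARK
} DemoStatus;

/* Returns 1 if every integrity check passes, 0 otherwise.
   destLen / srcLen are the values written back by the decode call;
   the expected sizes are what the caller knows from its container format. */
static int demo_decode_looks_complete(int res, DemoStatus status,
                                      size_t destLen, size_t expectedUnpacked,
                                      size_t srcLen, size_t expectedPacked)
{
    if (res != DEMO_OK)
        return 0;                       /* check 1: result code */
    if (status == DEMO_STATUS_NOT_FINISHED)
        return 0;                       /* check 1: status variable */
    if (destLen != expectedUnpacked)
        return 0;                       /* check 2: produced output size */
    if (srcLen != expectedPacked)
        return 0;                       /* check 3: consumed input size */
    return 1;
}
```

If only one of the two sizes is known, the corresponding comparison would simply be skipped.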
302
#### ***Multi-call State Decompressing (zlib-like interface)***

1. When to use: file->file decompressing
2. Compile files: LzmaDec.h + LzmaDec.c + 7zTypes.h
3. Memory Requirements:

- Buffer for input stream: any size (for example, 16 KB)
- Buffer for output stream: any size (for example, 16 KB)
- LZMA Internal Structures: state_size (16 KB for default settings)
- LZMA dictionary (dictionary size is encoded in the LZMA properties header)

**1)** Read the LZMA properties (5 bytes) and the uncompressed size (8 bytes, little-endian) into the header:

```c
  unsigned char header[LZMA_PROPS_SIZE + 8];
  ReadFile(inFile, header, sizeof(header));
```

**2)** Allocate the CLzmaDec structures (state + dictionary) using the LZMA properties:

```c
  CLzmaDec state;
  LzmaDec_Constr(&state);
  res = LzmaDec_Allocate(&state, header, LZMA_PROPS_SIZE, &g_Alloc);
  if (res != SZ_OK)
    return res;
```

**3)** Initialize the LzmaDec structure before each new LZMA stream, then call LzmaDec_DecodeToBuf in a loop:

```c
  LzmaDec_Init(&state);
  for (;;)
  {
    ...
    /* Prototype as given in this document; check LzmaDec.h for the
       exact declaration in your SDK version. */
    int res = LzmaDec_DecodeToBuf(CLzmaDec *p, Byte *dest, SizeT *destLen,
        const Byte *src, SizeT *srcLen, ELzmaFinishMode finishMode);
    ...
  }
```

**4)** Free all allocated structures:

```c
  LzmaDec_Free(&state, &g_Alloc);
```

For example code, see:
  C/Util/Lzma/LzmaUtil.c
352
### ***How To compress data***

1 Compile files:

```bash
  7zTypes.h
  Threads.h
  LzmaEnc.h
  LzmaEnc.c
  LzFind.h
  LzFind.c
  LzFindMt.h
  LzFindMt.c
  LzHash.h
```

2 Memory Requirements:

- (dictSize * 11.5 + 6 MB) + state_size

3 The LZMA encoder can use two memory allocators:

- alloc - for small arrays.
- allocBig - for big arrays.

For example, you can use Large RAM Pages (2 MB) in the allocBig allocator for better compression speed. Note that Windows has a poor implementation of Large RAM Pages.
It's OK to use the same allocator for alloc and allocBig.
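
The encoder memory estimate from step 2 can be turned into a quick calculation. This is a rough sketch of that formula only, assuming 1 MB = 2^20 bytes; `lzma_enc_mem_estimate` is a hypothetical helper name, and the result is an estimate rather than a guaranteed upper bound.

```c
#include <assert.h>
#include <stdint.h>

/* Estimated encoder memory in bytes: dictSize * 11.5 + 6 MB + state_size.
   The factor 11.5 is computed as 23 / 2 to stay in integer arithmetic. */
static uint64_t lzma_enc_mem_estimate(uint32_t dict_size, uint64_t state_size)
{
    const uint64_t six_mb = (uint64_t)6 * 1024 * 1024;
    return (uint64_t)dict_size * 23 / 2 + six_mb + state_size;
}
```

For a 16 MB dictionary and the default 16 KB state, the estimate comes to 199,245,824 bytes, about 190 MB.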
380
#### ***Single-call Compression with callbacks***

For example code, see:
  C/Util/Lzma/LzmaUtil.c

When to use: file->file compressing

**1)** Implement the callback structures for these interfaces:

```c
/* Callback interfaces you must implement:
   ISeqInStream, ISeqOutStream, ICompressProgress, ISzAlloc */

static void *SzAlloc(void *p, size_t size) { p = p; return MyAlloc(size); }
static void SzFree(void *p, void *address) { p = p; MyFree(address); }
static ISzAlloc g_Alloc = { SzAlloc, SzFree };

  CFileSeqInStream inStream;
  CFileSeqOutStream outStream;

  inStream.funcTable.Read = MyRead;
  inStream.file = inFile;
  outStream.funcTable.Write = MyWrite;
  outStream.file = outFile;
```

**2)** Create a CLzmaEncHandle object:

```c
  CLzmaEncHandle enc;

  enc = LzmaEnc_Create(&g_Alloc);
  if (enc == 0)
    return SZ_ERROR_MEM;
```

**3)** Initialize the CLzmaEncProps properties:

```c
  CLzmaEncProps props;
  LzmaEncProps_Init(&props);
```

  Then you can change some properties in that structure.

**4)** Send the LZMA properties to the LZMA encoder:

```c
  res = LzmaEnc_SetProps(enc, &props);
```

**5)** Write the encoded properties to the header:

```c
  Byte header[LZMA_PROPS_SIZE + 8];
  size_t headerSize = LZMA_PROPS_SIZE;
  UInt64 fileSize;
  int i;

  res = LzmaEnc_WriteProperties(enc, header, &headerSize);
  fileSize = MyGetFileLength(inFile);
  /* Append the uncompressed size as 8 little-endian bytes. */
  for (i = 0; i < 8; i++)
    header[headerSize++] = (Byte)(fileSize >> (8 * i));
  MyWriteFileAndCheck(outFile, header, headerSize);
```

**6)** Call the encoding function:

```c
  res = LzmaEnc_Encode(enc, &outStream.funcTable, &inStream.funcTable,
        NULL, &g_Alloc, &g_Alloc);
```

**7)** Destroy the LZMA encoder object:

```c
  LzmaEnc_Destroy(enc, &g_Alloc, &g_Alloc);
```

If a callback function returns an error code, LzmaEnc_Encode also returns that code,
or it can return a code such as SZ_ERROR_READ, SZ_ERROR_WRITE or SZ_ERROR_PROGRESS.
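
Step 5 above packs the uncompressed size into the header one byte at a time. The same loop is isolated below as a standalone sketch that runs without the SDK; `append_size_le64` is a hypothetical helper, and plain `unsigned char` and `uint64_t` stand in for the SDK's `Byte` and `UInt64` types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes `size` as 8 little-endian bytes starting at header[offset]
   and returns the offset just past the written bytes. */
static size_t append_size_le64(unsigned char *header, size_t offset,
                               uint64_t size)
{
    int i;

    assert(header != NULL);
    for (i = 0; i < 8; i++)
        header[offset++] = (unsigned char)(size >> (8 * i));
    return offset;
}
```

When the input length is not known in advance, passing `UINT64_MAX` writes eight 0xFF bytes, which is the "-1 means unknown size" case from the file format table.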
463
464---
465
#### ***Single-call RAM->RAM Compression***

Single-call RAM->RAM compression is similar to compression with callbacks,
but you provide pointers to buffers instead of pointers to stream callbacks:

```c
SRes LzmaEncode(Byte *dest, SizeT *destLen, const Byte *src, SizeT srcLen,
    const CLzmaEncProps *props, Byte *propsEncoded, SizeT *propsSize, int writeEndMark,
    ICompressProgress *progress, ISzAlloc *alloc, ISzAlloc *allocBig);
/*
Return code:
  SZ_OK               - OK
  SZ_ERROR_MEM        - Memory allocation error
  SZ_ERROR_PARAM      - Incorrect parameter
  SZ_ERROR_OUTPUT_EOF - output buffer overflow
  SZ_ERROR_THREAD     - errors in multithreading functions (only for Mt version)
*/
```

Defines

```bash
_LZMA_SIZE_OPT - Enable some optimizations in LZMA Decoder to get smaller executable code.
_LZMA_PROB32 - It can increase the speed on some 32-bit CPUs, but memory usage for
               some structures will be doubled in that case.
_LZMA_UINT32_IS_ULONG - Define it if int is 16-bit on your compiler and long is 32-bit.
_LZMA_NO_SYSTEM_SIZE_T - Define it if you don't want to use size_t type.
_7ZIP_PPMD_SUPPPORT - Define it if you don't want to support PPMD method in ANSI-C .7z decoder.
```
493
C++ LZMA Encoder/Decoder

The C++ LZMA code uses COM-like interfaces, so if you want to use it, it helps to study the basics of COM/OLE.

The C++ LZMA code is just a wrapper over the ANSI-C code.

C++ Notes

If you use some C++ code folders in 7-Zip (for example, the C++ code for .7z handling),
you must check that you work correctly with the "new" operator.

7-Zip can be compiled with MSVC 6.0, which doesn't throw an exception from the "new" operator.
So 7-Zip uses "CPP\Common\NewHandler.cpp", which redefines the "new" operator:

```cpp
void *operator new(size_t size)
{
  void *p = ::malloc(size);
  if (p == 0)
    throw CNewException();
  return p;
}
```

If you use an MSVC version that throws an exception from the "new" operator, you can compile without
"NewHandler.cpp", and the standard exception will be used. Some 7-Zip code
catches any exception in internal code and converts it to an HRESULT code,
so you don't need to catch CNewException if you call the COM interfaces of 7-Zip.
522
### ***Interface Examples:***

For example code, see: C/Util/Lzma/LzmaUtil.c

```bash
  cd C/Util/Lzma
  make -j -f makefile.gcc
  # output: ./_o/7lzma
```

```bash
  LZMA-C 22.01 (x64) : Igor Pavlov : Public domain : 2022-07-15

  Usage: lzma <e|d> inputFile outputFile
    e: encode file
    d: decode file
```

## Contribution

[https://sourceforge.net/p/sevenzip/_list/tickets](https://sourceforge.net/p/sevenzip/_list/tickets)

## Repositories Involved

[**developtools\hiperf**](https://gitee.com/openharmony/developtools_hiperf)
548
README_zh.md
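# third_party_lzma

## Description

LZMA is an improved version of the well-known LZ77 compression algorithm. It maximizes the compression ratio while keeping decompression fast and the memory required for decompression low.

LZMA2 is based on LZMA. It provides better multithreading support during compression, along with other improvements.

7z is a format for data compression and file archiving, and the main file format of the 7-Zip program [**7-Zip website**](https://www.7-zip.org).
The 7z format supports several compression methods: LZMA, LZMA2, and others. It also supports AES-256 based symmetric encryption.

XZ is a file format that uses LZMA2 data compression. The XZ format provides additional features: SHA/CRC data checks, filters for improving the compression ratio, and splitting into blocks and streams.

## Software Architecture

Software architecture description

| format/algorithm | C | C++ | C# | Java |
| :------ | :---------| :----- | :----- | :----- |
| LZMA compression and decompression | ✓ | ✓ | ✓ | ✓ |
| LZMA2 compression and decompression | ✓ | ✓ | | |
| XZ compression and decompression | ✓ | ✓ | | |
| 7z decompression | ✓ | ✓ | | |
| 7z compression | | ✓ | | |
| small SFXs for installers (7z decompression) | ✓ | | | |
| SFXs and SFXs for installers (7z decompression) | | ✓ | | |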
27
28---
29
30```bash
31/third_party/lzma
32├── Asm # asm files (optimized code for CRC calculation and Intel-AES encryption)
33│ ├── arm
34│ ├── arm64
35│ └── x86
36├── C # C files (compression / decompression and other)
37│ └── Util
38│ ├── 7z # 7z decoder program (decoding 7z files)
39│ ├── Lzma # LZMA program (file->file LZMA encoder/decoder)
40│ ├── LzmaLib # LZMA library (.DLL for Windows)
41│ └── SfxSetup # small SFX module for installers
42├── CPP
43│ ├── Common # common files for C++ projects
44│ ├── Windows # common files for Windows related code
45│ └── 7zip # files related to 7-Zip
46│ ├── Archive # files related to archiving
47│ │ ├── Common # common files for archive handling
48│ │ └── 7z # 7z C++ Encoder/Decoder
49│ ├── Bundles # Modules that are bundles of other modules (files)
50│ │ ├── Alone7z # 7zr.exe: Standalone 7-Zip console program (reduced version)
51│ │ ├── Format7zExtractR # 7zxr.dll: Reduced version of 7z DLL: extracting from 7z/LZMA/BCJ/BCJ2.
52│ │ ├── Format7zR # 7zr.dll: Reduced version of 7z DLL: extracting/compressing to 7z/LZMA/BCJ/BCJ2
53│ │ ├── LzmaCon # lzma.exe: LZMA compression/decompression
54│ │ ├── LzmaSpec # example code for LZMA Specification
55│ │ ├── SFXCon # 7zCon.sfx: Console 7z SFX module
56│ │ ├── SFXSetup # 7zS.sfx: 7z SFX module for installers
57│ │ └── SFXWin # 7z.sfx: GUI 7z SFX module
58│ ├── Common # common files for 7-Zip
59│ ├── Compress # files for compression/decompression
60│ ├── Crypto # files for encryption / decompression
61│ └── UI # User Interface files
62│ ├── Client7z # Test application for 7za.dll, 7zr.dll, 7zxr.dll
63│ ├── Common # Common UI files
64│ ├── Console # Code for console program (7z.exe)
65│ ├── Explorer # Some code from 7-Zip Shell extension
66│ ├── FileManager # Some GUI code from 7-Zip File Manager
67│ └── GUI # Some GUI code from 7-Zip
68├── CS
69│ └── 7zip
70│ ├── Common # some common files for 7-Zip
71│ └── Compress # files related to compression/decompression
72│ ├── LZ # files related to LZ (Lempel-Ziv) compression algorithm
73│ ├── LZMA # LZMA compression/decompression
74│ ├── LzmaAlone # file->file LZMA compression/decompression
75│ └── RangeCoder # Range Coder (special code of compression/decompression)
76├── DOC
77│ ├── 7zC.txt # 7z ANSI-C Decoder description
78│ ├── 7zFormat.txt # 7z Format description
79│ ├── installer.txt # information about 7-Zip for installers
80│ ├── lzma-history.txt # history of LZMA SDK
81│ ├── lzma-sdk.txt # LZMA SDK description
82│ ├── lzma-specification.txt # Specification of LZMA
83│ ├── lzma.txt # LZMA compression description
84│ └── Methods.txt # Compression method IDs for .7z
85└── Java
86 └── SevenZip
87 └── Compression # files related to compression/decompression
88 ├── LZ # files related to LZ (Lempel-Ziv) compression algorithm
89 ├── LZMA # LZMA compression/decompression
90 └── RangeCoder # Range Coder (special code of compression/decompression)
91```
92
## License

LZMA SDK is written and placed in the public domain by Igor Pavlov.

Some code in LZMA SDK is based on public domain code from other developers:

  1) PPMd var.H (2001): Dmitry Shkarin

  2) SHA-256: Wei Dai (Crypto++ library)

Anyone is free to copy, modify, publish, use, compile, sell, or distribute the
original LZMA SDK code, either in source code form or as a compiled binary, for any purpose, commercial or non-commercial, and by any means.

LZMA SDK code is compatible with open source licenses; for example, you can include it in GNU GPL or GNU LGPL code.
107
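## Build

### ***UNIX/Linux***

7-Zip can be compiled with gcc or clang using various options. The 7-Zip code also has two versions of some critical parts: one in C and one in assembler. Compiling the version with assembler code produces a faster 7-Zip binary. The 7-Zip assembler code uses a different syntax on each platform.

#### *arm64*

The arm64 versions of gcc and clang support the syntax of the arm64 assembler code.

#### *x86 and x86_64 (AMD64)*

Asmc Macro Assembler and JWasm both support MASM syntax on Linux, but JWasm does not support some CPU instructions used in 7-Zip.
If you want to compile a faster 7-Zip, you must install Asmc Macro Assembler on Linux: [https://github.com/nidud/asmc](https://github.com/nidud/asmc)

### ***Building commands***

The folder contains two main files for compiling:

- `makefile` - compiles the Windows version of 7-Zip with the `nmake` command
- `makefile.gcc` - compiles the Linux/macOS versions of 7-Zip with the `make` command

First, change to a folder that contains `makefile.gcc`:

```bash
  cd CPP/7zip/Bundles/Alone7z
```

```bash
  make -j -f makefile.gcc
```

The "*.mak" files in the "CPP/7zip/" folder can also be used to compile with optimized code and optimizing options. For example:

```bash
  cd CPP/7zip/Bundles/Alone7z
  make -j -f ../../cmpl_gcc.mak
```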
145
## **Interface Usage**

This section describes the LZMA encoding and decoding functions implemented in C.

Note: you can also read the LZMA Specification (lzma-specification.txt in the LZMA SDK).

You can also look at the example that uses LZMA encoding and decoding:
 ***C/Util/Lzma/LzmaUtil.c***

### ***LZMA compressed file format***

```bash
Offset Size Description
  0     1   Special LZMA properties (lc, lp, pb in encoded form)
  1     4   Dictionary size (little endian)
  5     8   Uncompressed size (little endian). -1 means unknown size
 13         Compressed data
```
164
ANSI-C (American National Standards Institute) LZMA Decoder
Please note that the ANSI-C interfaces were changed in LZMA SDK 4.58. If you want to use the old interfaces, you can download a previous version of the LZMA SDK from the sourceforge.net site.

To use the ANSI-C LZMA Decoder you need the following files:

```bash
  LzmaDec.h
  LzmaDec.c
  7zTypes.h
  Precomp.h
  Compiler.h
```

For example code, see: C/Util/Lzma/LzmaUtil.c

Memory requirements for LZMA decoding

1. Stack usage of the LZMA decoding function for local variables is no larger than 200-400 bytes.

2. The LZMA decoder uses a dictionary buffer and an internal state structure.

3. The internal state structure consumes state_size = (4 + (1.5 << (lc + lp))) KB. With the default settings (lc=3, lp=0), state_size = 16 KB.
187
### ***How To decompress data***

The LZMA Decoder (ANSI-C version) supports the following two interfaces:

**1)** Single call: LzmaDecode

**2)** Multiple calls: LzmaDec_DecodeToBuf (zlib-like interface)

**You must define your own memory allocator:**

Example:

```c
void *SzAlloc(void *p, size_t size) { p = p; return malloc(size); }
void SzFree(void *p, void *address) { p = p; free(address); }
ISzAlloc alloc = { SzAlloc, SzFree };
```

The `p = p;` statement is there only to suppress unused-parameter compiler warnings.

#### ***Single-call Decompressing***

1. When to use: RAM->RAM decompressing
2. Compile files: LzmaDec.h + LzmaDec.c + 7zTypes.h
3. Compile defines: none required
4. Memory Requirements:

- Input buffer: compressed size
- Output buffer: uncompressed size
- LZMA Internal Structures: state_size (16 KB for default settings)

**Interface:**
220
221```c
222 int LzmaDecode(Byte *dest, SizeT *destLen, const Byte *src, SizeT *srcLen,
223 const Byte *propData, unsigned propSize, ELzmaFinishMode finishMode,
224 ELzmaStatus *status, ISzAlloc *alloc);
225 In:
226 dest - output data
227 destLen - output data size
228 src - input data
229 srcLen - input data size
230 propData - LZMA properties (5 bytes)
231 propSize - size of propData buffer (5 bytes)
232 finishMode - It has meaning only if the decoding reaches output limit (*destLen).
233 LZMA_FINISH_ANY - Decode just destLen bytes.
234 LZMA_FINISH_END - Stream must be finished after (*destLen).
235 You can use LZMA_FINISH_END, when you know that
236 current output buffer covers last bytes of stream.
237 alloc - Memory allocator.
238
239 Out:
240 destLen - processed output size
241 srcLen - processed input size
242
243 Output:
244 SZ_OK
245 status:
246 LZMA_STATUS_FINISHED_WITH_MARK
247 LZMA_STATUS_NOT_FINISHED
248 LZMA_STATUS_MAYBE_FINISHED_WITHOUT_MARK
249 SZ_ERROR_DATA - Data error
250 SZ_ERROR_MEM - Memory allocation error
251 SZ_ERROR_UNSUPPORTED - Unsupported properties
252 SZ_ERROR_INPUT_EOF - It needs more bytes in input buffer (src).
253```
254
If the LZMA decoder sees the end marker before reaching the output buffer limit, it returns OK, and the output value of destLen will be less than the output buffer size limit.

You can use several checks to test data integrity after full decompression:

 1. Check the return value and the status variable.
 2. If you know the uncompressed data size, check that output(destLen) = uncompressedSize.
 3. If you know the compressed data size, check that output(srcLen) = compressedSize.

#### ***Multi-call State Decompressing (zlib-like interface)***

1. When to use: file->file decompressing
2. Compile files: LzmaDec.h + LzmaDec.c + 7zTypes.h
3. Memory Requirements:

- Buffer for input stream: any size (for example, 16 KB)
- Buffer for output stream: any size (for example, 16 KB)
- LZMA Internal Structures: state_size (16 KB for default settings)
- LZMA dictionary (dictionary size is encoded in the LZMA properties header)

Procedure:

**1)** Read the LZMA properties (5 bytes) and the uncompressed size (8 bytes, little-endian) into the header:

```c
  unsigned char header[LZMA_PROPS_SIZE + 8];
  ReadFile(inFile, header, sizeof(header));
```
282
**2)** Allocate the CLzmaDec structures (state + dictionary) using the LZMA properties:

```c
  CLzmaDec state;
  LzmaDec_Constr(&state);
  res = LzmaDec_Allocate(&state, header, LZMA_PROPS_SIZE, &g_Alloc);
  if (res != SZ_OK)
    return res;
```

**3)** Initialize LzmaDec, then call LzmaDec_DecodeToBuf in a loop:

```c
  LzmaDec_Init(&state);
  for (;;)
  {
    ...
    /* Prototype as given in this document; check LzmaDec.h for the
       exact declaration in your SDK version. */
    int res = LzmaDec_DecodeToBuf(CLzmaDec *p, Byte *dest, SizeT *destLen,
        const Byte *src, SizeT *srcLen, ELzmaFinishMode finishMode);
    ...
  }
```

**4)** Free all allocated structures:

```c
  LzmaDec_Free(&state, &g_Alloc);
```

For example code, see:
  C/Util/Lzma/LzmaUtil.c
314
### ***How To compress data***

1 Compile files:

```bash
  7zTypes.h
  Threads.h
  LzmaEnc.h
  LzmaEnc.c
  LzFind.h
  LzFind.c
  LzFindMt.h
  LzFindMt.c
  LzHash.h
```

2 Memory Requirements:

- (dictSize * 11.5 + 6 MB) + state_size

The LZMA encoder can use two memory allocators:

- alloc - for small arrays.
- allocBig - for big arrays.

For example, you can use Large RAM Pages (2 MB) in the allocBig allocator for better compression speed. Note that Windows has a poor implementation of Large RAM Pages. It's also OK to use the same allocator for alloc and allocBig.

#### ***Single-call Compression with callbacks***

For example code, see:
  C/Util/Lzma/LzmaUtil.c

When to use: file->file compressing

**1)** Implement the callback functions for these interfaces:
350
351```c
352ISeqInStream
353ISeqOutStream
354ICompressProgress
355ISzAlloc
356
357static void *SzAlloc(void *p, size_t size) { p = p; return MyAlloc(size); }
358static void SzFree(void *p, void *address) { p = p; MyFree(address); }
359static ISzAlloc g_Alloc = { SzAlloc, SzFree };
360
361 CFileSeqInStream inStream;
362 CFileSeqOutStream outStream;
363
364 inStream.funcTable.Read = MyRead;
365 inStream.file = inFile;
366 outStream.funcTable.Write = MyWrite;
367 outStream.file = outFile;
368```
369
**2)** Create a CLzmaEncHandle object:
371
372```c
373 CLzmaEncHandle enc;
374
375 enc = LzmaEnc_Create(&g_Alloc);
376 if (enc == 0)
377 return SZ_ERROR_MEM;
378```
379
**3)** Initialize the CLzmaEncProps properties:

```c
  CLzmaEncProps props;
  LzmaEncProps_Init(&props);
```

Then you can change some properties in that structure.

**4)** Send the properties set in the previous step to the LZMA encoder:
389
390```c
391 res = LzmaEnc_SetProps(enc, &props);
392```
393
**5)** Write the encoded properties to the header:

```c
  Byte header[LZMA_PROPS_SIZE + 8];
  size_t headerSize = LZMA_PROPS_SIZE;
  UInt64 fileSize;
  int i;

  res = LzmaEnc_WriteProperties(enc, header, &headerSize);
  fileSize = MyGetFileLength(inFile);
  /* Append the uncompressed size as 8 little-endian bytes. */
  for (i = 0; i < 8; i++)
    header[headerSize++] = (Byte)(fileSize >> (8 * i));
  MyWriteFileAndCheck(outFile, header, headerSize);
```

**6)** Call the encoding function:
410
411```c
412 res = LzmaEnc_Encode(enc, &outStream.funcTable, &inStream.funcTable,
413 NULL, &g_Alloc, &g_Alloc);
414```
415
**7)** Destroy the LZMA encoder object:
417
418```c
419 LzmaEnc_Destroy(enc, &g_Alloc, &g_Alloc);
420```
421
If a callback function returns an error code, LzmaEnc_Encode also returns that code, or it can return a code such as SZ_ERROR_READ, SZ_ERROR_WRITE or SZ_ERROR_PROGRESS.
423
424---
425
#### ***Single-call RAM->RAM Compression***

Single-call RAM->RAM compression is similar to compression with callbacks, but you provide pointers to buffers instead of pointers to callback functions.

```c
SRes LzmaEncode(Byte *dest, SizeT *destLen, const Byte *src, SizeT srcLen,
    const CLzmaEncProps *props, Byte *propsEncoded, SizeT *propsSize, int writeEndMark,
    ICompressProgress *progress, ISzAlloc *alloc, ISzAlloc *allocBig);
/*
Return code:
  SZ_OK               - OK
  SZ_ERROR_MEM        - Memory allocation error
  SZ_ERROR_PARAM      - Incorrect parameter
  SZ_ERROR_OUTPUT_EOF - output buffer overflow
  SZ_ERROR_THREAD     - errors in multithreading functions (only for Mt version)
*/
```
441
Defines

```c
/*
_LZMA_SIZE_OPT - Enable some optimizations in LZMA Decoder to get smaller executable code.
_LZMA_PROB32 - It can increase the speed on some 32-bit CPUs, but memory usage for
               some structures will be doubled in that case.
_LZMA_UINT32_IS_ULONG - Define it if int is 16-bit on your compiler and long is 32-bit.
_LZMA_NO_SYSTEM_SIZE_T - Define it if you don't want to use size_t type.
_7ZIP_PPMD_SUPPPORT - Define it if you don't want to support PPMD method in ANSI-C .7z decoder.
*/
```
452
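C++ LZMA Encoder/Decoder

The C++ LZMA code uses COM-like interfaces. If you want to use it, it helps to learn the basics of COM (Component Object Model) / OLE (Object Linking and Embedding) / DDE (Dynamic Data Exchange).

The C++ LZMA code is just a wrapper over the ANSI-C code.

Notes:
If you use the C++ code in the 7-Zip folders, you must check that you use the new operator correctly.
When 7-Zip is compiled with MSVC 6.0, the new operator does not throw an exception. So 7-Zip redefines the new operator in CPP\Common\NewHandler.cpp:

```cpp
void *operator new(size_t size)
{
  void *p = ::malloc(size);
  if (p == 0)
    throw CNewException();
  return p;
}
```

If your MSVC version throws an exception from the new operator, you can leave out "NewHandler.cpp" when compiling 7-Zip,
and the standard exception will be used. Some 7-Zip code catches any exception in internal code and converts it to an HRESULT code, so you don't need to catch CNewException if you call the COM interfaces of 7-Zip.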
475
### ***Interface Examples:***
477
478Look example code : C/Util/Lzma/LzmaUtil.c
479
480```bash
481 cd C/Util/Lzma
482 make -j -f makefile.gcc
483 output: ./_o/7lzma
484```
485
486```bash
487 LZMA-C 22.01 (x64) : Igor Pavlov : Public domain : 2022-07-15
488
489 Usage: lzma <e|d> inputFile outputFile
490 e: encode file
491 d: decode file
492```
493
## Contribution
495
496[https://sourceforge.net/p/sevenzip/_list/tickets](https://sourceforge.net/p/sevenzip/_list/tickets)
497
## Repositories Involved
499
500[**developtools\hiperf**](https://gitee.com/openharmony/developtools_hiperf)
501