# License Classifier

[![Build status](https://travis-ci.org/google/licenseclassifier.svg?branch=master)](https://travis-ci.org/google/licenseclassifier)

## Introduction

The license classifier is a library and set of tools that analyze text to
determine what type of license it contains. It searches for license texts in a
file and compares them to an archive of known licenses. These files could be,
e.g., `LICENSE` files containing one or more licenses, or source code files
with the license text in a comment.

A "confidence level" is associated with each result, indicating how close the
match was. A confidence level of `1.0` indicates an exact match, while a
confidence level of `0.0` indicates that no license matched the text.

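As a rough illustration of how a caller might act on confidence levels, the
sketch below keeps only results at or above a chosen cutoff. The `Result` type,
the `AboveThreshold` helper, and the `0.8` cutoff are assumptions made for this
example; they are not part of the library's API.

```go
package main

import "fmt"

// Result is a hypothetical record for one match; it is not a library type.
type Result struct {
	License    string
	Confidence float64 // 1.0 = exact match, 0.0 = no match
}

// AboveThreshold returns the results whose confidence is at least minConf,
// preserving their original order.
func AboveThreshold(results []Result, minConf float64) []Result {
	var kept []Result
	for _, r := range results {
		if r.Confidence >= minConf {
			kept = append(kept, r)
		}
	}
	return kept
}

func main() {
	// Made-up sample values, for illustration only.
	results := []Result{
		{License: "MIT", Confidence: 1.0},
		{License: "Apache-2.0", Confidence: 0.42},
	}
	for _, r := range AboveThreshold(results, 0.8) {
		fmt.Printf("%s (confidence: %v)\n", r.License, r.Confidence)
	}
}
```
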
## Usage

### One-time setup

Use the `license_serializer` tool to regenerate the `licenses.db` archive.
The archive contains preprocessed license texts for quicker comparisons against
unknown texts.

```shell
$ go run tools/license_serializer/license_serializer.go -output licenses
```

### Identifying licenses

Use the `identify_license` command line tool to identify the license(s)
within a file.

```shell
$ go run tools/identify_license/identify_license.go /path/to/LICENSE
LICENSE: GPL-2.0 (confidence: 1, offset: 0, extent: 14794)
LICENSE: LGPL-2.1 (confidence: 1, offset: 18366, extent: 23829)
LICENSE: MIT (confidence: 1, offset: 17255, extent: 1059)
```

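If the tool's output needs to be consumed by another program, each line can be
split into its fields. The following is a minimal sketch that assumes the line
format matches the sample output above exactly; the `Match` type, `ParseLine`
function, and the regular expression are illustrative and not part of the
library, and the real format may differ between versions.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// Match holds the fields of one output line. It is a hypothetical type for
// this sketch, not a library type.
type Match struct {
	File       string
	License    string
	Confidence float64
	Offset     int
	Extent     int
}

// linePattern mirrors the sample output format (an assumption).
var linePattern = regexp.MustCompile(
	`^(.+): (\S+) \(confidence: ([0-9.]+), offset: (\d+), extent: (\d+)\)$`)

// ParseLine converts one output line into a Match, or returns an error if
// the line does not follow the assumed format.
func ParseLine(line string) (Match, error) {
	m := linePattern.FindStringSubmatch(line)
	if m == nil {
		return Match{}, fmt.Errorf("unrecognized line: %q", line)
	}
	conf, err := strconv.ParseFloat(m[3], 64)
	if err != nil {
		return Match{}, err
	}
	offset, err := strconv.Atoi(m[4])
	if err != nil {
		return Match{}, err
	}
	extent, err := strconv.Atoi(m[5])
	if err != nil {
		return Match{}, err
	}
	return Match{File: m[1], License: m[2], Confidence: conf, Offset: offset, Extent: extent}, nil
}

func main() {
	match, err := ParseLine("LICENSE: MIT (confidence: 1, offset: 17255, extent: 1059)")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", match)
}
```

Anchoring the pattern at both ends makes the parser reject anything that is not
a complete result line, so unexpected output fails loudly instead of being
misread.
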
## Adding a new license

Adding a new license is straightforward:

1.  Create a file in `licenses/`.

    *   The filename should be the name of the license or its abbreviation. If
        the license is an Open Source license, use the appropriate identifier
        specified at https://spdx.org/licenses/.
    *   If the license is the "header" version of the license, append the suffix
        "`.header`" to it. See `licenses/README.md` for more details.

2.  Add the license name to the list in `license_type.go`.

3.  Regenerate the `licenses.db` file by running the license serializer:

    ```shell
    $ license_serializer -output licenseclassifier/licenses
    ```

4.  Create and run appropriate tests to verify that the license is indeed
    present.

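The naming convention from step 1 can be sketched as a small helper that
separates the license identifier from the header marker. This assumes the name
ends in `.header` exactly as described above, with any file extension already
stripped; `SplitLicenseName` is a hypothetical helper and not part of the
library.

```go
package main

import (
	"fmt"
	"strings"
)

// headerSuffix is the marker described in step 1 for "header" versions.
const headerSuffix = ".header"

// SplitLicenseName returns the license identifier and whether the name
// denotes the header version of that license. Hypothetical helper.
func SplitLicenseName(name string) (id string, isHeader bool) {
	if strings.HasSuffix(name, headerSuffix) {
		return strings.TrimSuffix(name, headerSuffix), true
	}
	return name, false
}

func main() {
	for _, name := range []string{"MIT", "Apache-2.0.header"} {
		id, isHeader := SplitLicenseName(name)
		fmt.Printf("%s -> id=%s header=%t\n", name, id, isHeader)
	}
}
```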
----
This is not an official Google product (experimental or otherwise); it is just
code that happens to be owned by Google.